United  States  Department  of  the  Interior 

BUREAU  OF  LAND  MANAGEMENT 

DENVER  SERVICE  CENTER 

DENVER FEDERAL CENTER, BUILDING 50

DENVER, COLORADO 80225


IN REPLY REFER TO

7300 (D-471)


May 10, 1985


Information  Bulletin  No.  DSC  85-10^ 
To:       All  Field  Officials 


From: 


Service  Center  Director 


Subject:  Range  and  Watershed  Monitoring  Notes 

The enclosed notebook on Rangeland Watershed Monitoring was compiled for the
BLM Colorado Soil, Water, and Air Workshop held April 10-12, 1985. The
notebook, a reference source for field specialists, is concerned with planning
and designing watershed monitoring projects.

The notebook is divided into six chapters. Chapter 1 defines monitoring,
suggests an outline for monitoring plans, and distinguishes between
direct (sampling) and indirect (modeling) monitoring strategies. Chapter 2
reviews statistical analysis packages available on the BLM Honeywell DPS-8
computer. Chapters 3-6 (reprints) describe various techniques and designs
that may be used in direct monitoring.

The Division of Resource Systems is developing additional information on
direct monitoring strategies, sampling designs, data analysis procedures, and
methods for interpreting rangeland monitoring data, e.g., cover, in terms of
runoff and soil loss. If you have any questions, please contact Bill Jackson,
Division of Resource Systems, D-471, FTS 776-0148, Comm. (303) 236-0148; or
Shirley Hudson, Division of Resource Systems, D-472, FTS 776-0152, Comm. (303)
236-0152.




ACTING 


1 Enclosure

Encl. 1 - Rangeland Watershed Monitoring Notes (227 pp)


Distribution 

WO (200), Room 5654 - 1

WO (201), Premier, Room 906 - 1

WO (202), Premier, Room 906 - 1

D-245A - 1

D-400 - 4

D-470  -  2 

THA7X  >  1 
RAs  t 




NOTES:   RANGELAND  WATERSHED  MONITORING 


compiled  for: 




BLM  Colorado 
Soil,  Water,  Air  Workshop. 
Glenwood  Springs,  Colorado 

April 10-12, 1985


BUREAU  OF  LAND  MANAGEMENT  LIBRARY 


Denver ,    Colorado 


88068951 


Denver Service Center
April 1985


Bureau  of  Land  Management 

Library 

Bldg.  50,  Denver  Federal  Center 
Denver,  CO  80225 




CONTENTS 


1.  General  Notes 

2.  Statistical  Packages  on  BLM  Honeywell  Computer 

3.  Reprint:  Water  Quality  Monitoring  Programs 

4.  Reprint: The Use of the Paired Basin Technique in Flow-Related
    Water-Quality Studies

5.  Reprint:   Estimating  Soil  Erosion  Using  an  Erosion  Bridge 

6.  Runoff  Plots:  Havre,  Montana 




OBJECTIVES 

PART  I:  GENERAL 

.  DEFINE  MONITORING 

.  WHEN  TO  MONITOR 

.  WHAT  TO  MONITOR  (PARAMETERS) 

.  COMPONENTS  OF  A  MONITORING  PLAN 

.  SPECIFIC  STRATEGIES:  DIRECT  MONITORING 

.  SPECIFIC  STRATEGIES:  INDIRECT  MONITORING 

PART  II:  STATISTICS 
.  SAMPLE  DESIGNS 
.  DATA  ANALYSIS 

PART  III:  CASE  STUDIES 


WATERSHED  MONITORING 


THE  COLLECTION,  ANALYSIS  AND  INTERPRETATION 
OF  DATA  TO: 

.  DETERMINE  IF  WATERSHED  MANAGEMENT 
OBJECTIVES  ARE  BEING  ACHIEVED 

.  INSURE  COMPLIANCE  WITH  LAWS, 
STANDARDS,  ETC 


MONITORING ≠ INVENTORY

BUT,  MONITORING  DATA  MAY: 

.  QUANTIFY  EFFECTS  OF  LAND  MANAGEMENT 
ON  WATERSHED  VALUES 

.  PROVIDE  INFORMATION  FOR  PLANNING 
AND  WATERSHED  ANALYSIS 

.  VALIDATE  /  CALIBRATE  GENERALIZED  MODELS 


WHEN  /  WHY  MONITOR 

NECESSARY  FOR  RESPONSIBLE  MANAGEMENT 
HIGH  WATERSHED  VALUES 
RESOURCE  CONFLICTS 
KNOWLEDGE/INFORMATION  GAPS 
MANAGEMENT  COSTS 
LEGAL  REQUIREMENTS 

WATERSHED  ACTIVITY  PLAN 


WHAT  TO  MONITOR  (PARAMETERS) 

PARAMETERS  SHOULD  RELATE  TO  WATERSHED 
MANAGEMENT  OBJECTIVE 

PARAMETERS  SHOULD: 

BE  NUMERICAL  MEASURE  OF 
WATERSHED  CHARACTERISTIC 

HAVE  KNOWN  OR  DETERMINABLE  RELATIONS 
TO  WATERSHED  CONDITION 

BE  SENSITIVE  TO  MANAGEMENT  CHANGES 


WATERSHED VALUE      GOOD PARAMETER          POOR PARAMETER

SOIL PRODUCTIVITY    SOIL LOSS               SUSP. SED.
WATER QUALITY        SUSP. SED.              SOIL LOSS
CHANNEL QUALITY      CHAN. GEOM. (LONG)      SUSP. SED. PARTICLE SIZE
SED. YIELD           RES. SURVEY             SUSP. SED.
FLOODS               PEAK FLOW (PAIRED)      PEAK FLOW (SINGLE)
WATER SUPPLY         ANNUAL Q (PAIRED)       q, Q (SINGLE)


MONITORING PLANS

.  STATEMENT  OF  MANAGEMENT  OBJECTIVE 

.  WORKING  STATEMENT  OF  MONITORING  OBJECTIVE 

IDENTIFICATION  AND  DESCRIPTION  OF  PARAMETERS 
HYPOTHESIS 

.  METHODS  (EG.  EQUIPMENT,  TECHNIQUES) 

.  SAMPLING  DESIGN 

SIGNIFICANCE  LEVELS 

LOCATION 

FREQUENCY 

BLOCKING,  STRATIFICATION,  CO-VARIABLES,  ETC. 

.  DATA  ANALYSIS  PLAN 

.  DATA  INTERPRETATION,  REPORT  PLAN 

.  IMPLEMENTATION  PLAN 
SCHEDULE 
BUDGET 
ETC. 


DIRECT  MONITORING  STRATEGIES 

I.    UPSLOPE  RUNOFF,  SOIL  LOSS 

PAIRED  SAMPLE  PLOTS  -  OLD  EXCLOSURE 
PAIRED  SAMPLE  PLOTS  -  NEW  EXCLOSURE 
UNPAIRED  PLOTS  -  NO  CONTROL 

II.    INSTREAM  DISCHARGE,  SEDIMENT  TRANSPORT 
SINGLE  POINT  -  TREND 
CONTROL  -  UPSTREAM  /  DOWNSTREAM 
CONTROL  -  PAIRED 


INDIRECT  MONITORING  STRATEGIES 


I.    SOIL  LOSS  MODEL 
II.    RUNOFF  MODEL 
III.  SEDIMENT YIELD MODELS


UPSLOPE  RUNOFF,  SOIL-LOSS  PLOTS 


[Figure: plot layouts for Designs A, B, C, and D]


INSTREAM  SAMPLING 


[Figure: instream sampling layouts for Designs A, B, and C; one panel is labeled "Pasture"]


SOIL  -  LOSS  MODEL 


USLE:  A=RKLSCP 


USE PAIRED PLOTS TO VALIDATE C-FACTOR
AND QUANTIFY EFFECTS OF MANAGEMENT

.  BE  SURE  TO  COLLECT 
C  -  FACTOR  DATA 
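
The USLE product above can be sketched in Python as follows. This is only an illustration: the function name, the combined `ls` argument (the L and S factors multiplied together), and every factor value are hypothetical and are not taken from the notebook.

```python
def usle_soil_loss(r, k, ls, c, p):
    """Return soil loss A = R * K * LS * C * P (USLE).

    ls is the combined slope length and steepness factor (L times S).
    """
    for name, value in (("R", r), ("K", k), ("LS", ls), ("C", c), ("P", p)):
        if value < 0:
            raise ValueError(f"{name} must be non-negative, got {value}")
    return r * k * ls * c * p


# Hypothetical paired plots: identical site factors, different cover (C).
grazed = usle_soil_loss(r=20.0, k=0.3, ls=1.5, c=0.20, p=1.0)
exclosure = usle_soil_loss(r=20.0, k=0.3, ls=1.5, c=0.10, p=1.0)

# With R, K, LS and P held equal, the ratio of losses equals the ratio of C.
loss_ratio = grazed / exclosure
```

Because every other factor is shared by the two plots, the predicted loss ratio depends only on C, which is why the slide stresses collecting C-factor data on paired plots.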


RAINFALL  -  RUNOFF  MODEL 


SCS  CURVE  NUMBER  MODEL: 


$$Q = \frac{(P - 0.2S)^2}{P + 0.8S}, \qquad \text{where } S = \frac{1000}{CN} - 10$$


USE  RAINFALL  -  RUNOFF  INFORMATION  TO 
CALCULATE  CN 

TEST  SENSITIVITY  OF  CN  TO  MANAGEMENT 
.  BE  SURE  TO  QUANTIFY 
CN  VARIABLES 
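
A minimal Python sketch of the curve number relation, assuming depths in inches and the usual initial abstraction of 0.2S. The function names are hypothetical. The second function inverts the runoff equation algebraically to back-calculate CN from an observed rainfall-runoff pair, which is one way to carry out the "calculate CN" step; the notebook does not say which procedure it intends.

```python
import math


def runoff_depth(p, cn):
    """Runoff Q (inches) from rainfall P (inches) for a given curve number."""
    s = 1000.0 / cn - 10.0          # potential retention S
    initial_abstraction = 0.2 * s
    if p <= initial_abstraction:    # no runoff until rainfall exceeds 0.2S
        return 0.0
    return (p - initial_abstraction) ** 2 / (p + 0.8 * s)


def curve_number(p, q):
    """Back-calculate CN from one observed rainfall P and runoff Q (inches).

    Solves Q = (P - 0.2S)^2 / (P + 0.8S) for S, taking the smaller root.
    """
    if not 0.0 < q <= p:
        raise ValueError("need 0 < Q <= P")
    s = 5.0 * (p + 2.0 * q - math.sqrt(4.0 * q * q + 5.0 * p * q))
    return 1000.0 / (s + 10.0)


# Hypothetical storm: 2.0 inches of rain on a plot with CN = 80.
q_example = runoff_depth(2.0, 80.0)
cn_example = curve_number(2.0, q_example)
```

Computing CN storm by storm for treated and control plots gives a direct way to test how sensitive CN is to management.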


SEDIMENT  YIELD  MODELS 


STORM  PERIOD: 


MUSLE:  $Y = 11.8\,(Q\,q_p)^{0.56}\,KLSCP$


LONG  TERM 


PSIAC 
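
The storm-period MUSLE equation can be sketched as below. The coefficient and exponent are keyword arguments because the source scan is hard to read at that point; the defaults of 11.8 and 0.56 are the commonly published metric form (runoff volume in cubic meters, peak flow in cubic meters per second, yield in metric tons), and those units are an assumption here. The function name and the input values are hypothetical.

```python
def musle_sediment_yield(runoff_volume, peak_flow, k, ls, c, p,
                         coeff=11.8, exponent=0.56):
    """Storm sediment yield Y = coeff * (Q * qp)**exponent * K * LS * C * P."""
    if runoff_volume < 0 or peak_flow < 0:
        raise ValueError("runoff volume and peak flow must be non-negative")
    return coeff * (runoff_volume * peak_flow) ** exponent * k * ls * c * p


# Hypothetical storm on a small watershed.
base = musle_sediment_yield(runoff_volume=500.0, peak_flow=0.8,
                            k=0.3, ls=1.5, c=0.10, p=1.0)
# Doubling the cover factor C doubles the predicted yield.
doubled_c = musle_sediment_yield(runoff_volume=500.0, peak_flow=0.8,
                                 k=0.3, ls=1.5, c=0.20, p=1.0)
```

Unlike the USLE, the runoff term replaces the rainfall factor, so the estimate applies to a single storm.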


Comparison of Three Statistical Packages - STATPACK, SPSSX, and BMDP -
Available on the Honeywell

Characteristic           STATPACK                 SPSSX                    BMDP
-----------------------  -----------------------  -----------------------  -----------------------
Interactive              Yes                      No                       No

How to access            Type "STPK"              (e.g.) A363/SPSSXCC      (e.g.) A363/BMDPCC

Quality of manual        Poor                     Good                     Fair

Best use of package      Small, uncomplicated     Most analyses, except    Unusual, complicated
                         analyses                 the unusual              analyses

Type of data input       Interactive, file with   File-fixed or 3 types    File-fixed or 3 types
                         fixed or free format.    of free format           of free format or
                         Free must have , or /                             FORTRAN subroutines
                         as separators

Will it accept           No                       Yes                      No
non-numeric input?

Will it transform data?  Yes                      Yes                      Yes

Will it accept missing   No                       Yes                      Yes
values or select only
a subset of cases?

Maximum number of cases  250                      Unlimited                Unlimited

Maximum number of        15                       500                      500
variables


For  more  information  on  statistics  and  statistical  packages,  call: 

Shirley Hudson     FTS      776-0144
                   Comm. (303) 236-0144

Mike Garratt       FTS      776-0096
                   Comm. (303) 236-0096


The new

LIST A363/SPSSXCC

$$S,J,T,MONI
$:IDENT:G4AM/??????,D-440 GARRATT     <- your charge code, password
$:SELECT:SPSS/X                       <- calls all SPSSX programs
$:PRMFL:11,R,S,A363/SPSSDATA          <- file where your data [illegible]
INFO ALL
TITLE 'TEST REGRESSION FOR SPSSX'
FILE LABEL INPUT DATA FROM FILE SPSSDATA
FILE HANDLE DATAINPT /UNIT=11
DATA LIST FILE=DATAINPT /LBS 1-6 TEMP 7-11
VARIABLE LABELS LBS 'POUNDS OF PRESSURE'
   TEMP 'TEMPERATURE IN DEGREES'
CONDESCRIPTIVE ALL
STATISTICS ALL
NONPAR CORR LBS WITH TEMP
PEARSON CORR LBS WITH TEMP
REGRESSION DESCRIPTIVES=ALL/
   VARIABLES=LBS TEMP/ STATISTICS=ALL/
   DEPENDENT=LBS/ ENTER/
   RESIDUALS/ SCATTERPLOT=(LBS,TEMP)/
FINISH
$:ENDJOB


The old

LIST A363/SPSSCC

$$S,J,T,MONI
$:IDENT:G4AM/??????,D-440 GARRATT     <- your charge code, password
$:SELECT:SPSS/SPSS                    <- calls all SPSS programs
$:PRMFL:08,R,S,A363/SPSSDATA          <- file where your data [illegible]
RUN NAME:TEST REGRESSION
FILE NAME:SPSSDATA
VARIABLE LIST:LBS,TEMP
INPUT FORMAT:FIXED(F6.0,F5.0)
NEW REGRESSION:DESCRIPTIVES=ALL/
::VARIABLES=LBS TEMP/ STATISTICS=ALL/
::DEPENDENT=LBS/ ENTER/
::RESIDUALS/ SCATTERPLOT=(LBS,TEMP)/
READ INPUT DATA
FINISH
$:ENDJOB
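
For readers without access to the Honeywell, the analysis these job listings request (read LBS from columns 1-6 and TEMP from columns 7-11, then correlate and regress LBS on TEMP) can be sketched in Python. This is not a translation of the SPSS output; the function names and the four data records are hypothetical, and only the column layout comes from the listings.

```python
import math


def parse_fixed(lines):
    """Read LBS from columns 1-6 and TEMP from columns 7-11 of each record."""
    rows = []
    for line in lines:
        if not line.strip():
            continue                      # skip blank records
        padded = line.ljust(11)
        rows.append((float(padded[0:6]), float(padded[6:11])))
    return rows


def regress_lbs_on_temp(rows):
    """Least-squares fit LBS = intercept + slope * TEMP, plus Pearson r."""
    n = len(rows)
    ys = [lbs for lbs, _ in rows]         # dependent variable
    xs = [temp for _, temp in rows]       # independent variable
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical records laid out like the SPSSDATA file.
sample = ["  10.0 50.0", "  12.0 60.0", "  14.0 70.0", "  16.0 80.0"]
slope, intercept, r = regress_lbs_on_temp(parse_fixed(sample))
```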


If you have problems - call [handwritten name and FTS number illegible]


STATPACK

CALL DAM RERCKE AT COMM 303-236-6512 TO OBTAIN A COPY OF THE STATPACK MANUAL

SECTION 1

INTRODUCTION


The primary purpose of the Statistical Package (STATPACK) is to make
available computer programs which permit the user to interact with the
computer through a remote terminal while he is performing statistical
analyses. STATPACK consists of 21 analyses. These programs are
designed to ask the user to enter his problem parameters and variables
and to make decisions at certain key points in the analyses.
Communication between the user and the computer is carried out in a
conversational manner. Thus, the user can solve his problem from a
remote  terminal  without  knowing  anything  about  computers  and 
programming.  However,  it  is  assumed  that  anyone  using  the  various 
analyses  is  familiar  with  the  concepts  involved.  The  computational 
procedures  used  in  each  analysis  are  outlined.  Additional  information 
on  the  concepts  and  the  computations  may  be  found  in  the  description 
of  that  analysis. 

While  this  package  includes  many  of  the  tools  necessary  to  solve  the 
more  commonly  encountered  statistical  problems,  there  is  no  intent  to 
imply that these analyses represent the state of the art in statistics.
As with all tools, the user should understand their capabilities
and their application to his functional requirements before
deciding  to  use  them. 


CHARACTERISTICS 

Some  of   the  characteristics  of   the  Statistical  Package  are  as  follows: 

1.  Communication  between  the  user  and  the  computer  is  conversational. 

2.  The  programs  can  be  run  from  most  remote  teletype  terminals. 

3.  The  programs   perform  error  checking   for  input   parameters   and   data 
and  give  the  user  the  opportunity  to  correct  possible  errors. 

4.  Data  may  be  added,    replaced,    deleted,    transformed  or   generated   at 
the  user's   specifications. 

5.  Multiple  data  sets  may  be  processed  from  an  input  file. 


TYPE STPK TO RUN STATPACK. TYPE SOS FOR EXPLANATIONS
AT ANY STAGE OF THE PROGRAM.


STATPACK

TABLE OF CONTENTS

SECTION 1.  INTRODUCTION

    CHARACTERISTICS
    LIMITATIONS
    PRECISION
    HOW TO COMMUNICATE WITH THE COMPUTER
        How to Start
        Use of the Code SOS
        Abbreviated Commands
        YES or NO Reply
        Reply with Information
        Entering Data
        Specifications of the Retained Variables
        How to Stop

SECTION 2.  USING THE SYSTEM

    INPUT FILES
        Example of Basic Programs
    EXAMPLES OF ANALYSIS
        Sample Problem

SECTION 3.  DESCRIPTION OF ANALYSES

     1  EDIT
     2  TRANSFORMATION
     3  ELEMENTARY STATISTICS
     4  CORRELATION
     5  CROSS TABULATION
     6  SCATTER DIAGRAM
     7  HISTOGRAM
     8  LINE PLOT
     9  RANK CORRELATION
    10  CHI-SQUARE
    11  t-TEST
    12  REGRESSION
    13  STEPWISE REGRESSION
    14  MULTIPLE REGRESSION
    15  POLYNOMIAL REGRESSION
    16  ANALYSIS OF VARIANCE
            Example
            Problem Limitations
    17  CANONICAL CORRELATION
    18  FACTOR ANALYSIS
    19  DISCRIMINANT ANALYSIS
    20  EXPONENTIAL SMOOTHING
    21  PROBIT ANALYSIS




BMDP Statistical Software

1981 


W.J.  Dixon,  chief  editor 

M.  B.  Brown 

L.  Engelman 

J. W. Frane

M. A. Hill

R. I. Jennrich

J.  D.  Toporek 


[Note: lists four files on the Honeywell, among them A363/BMDPCC, holding an
introductory guide, advanced features, changes, and an example job; the
remaining file names are illegible in the scan.]




UNIVERSITY OF CALIFORNIA PRESS

BERKELEY • LOS ANGELES • LONDON • 1981


APPENDIX  D 




How  To: 


Request  Copies  of  the  BMDP  Programs 

Obtain  Additional  Manuals 

Report  Difficulties  and  Suspected  Errors 

Arrange  a  Visit  by  a  BMDP  Statistician 

Obtain  BMDP  Communications 

Obtain  BMDP  Technical  Reports 


Copies  of  the  BMDP  Programs 

BMDP  programs  are  written  in  FORTRAN  (to  maximize 
portability)  and  distributed  on  tape  as  FORTRAN  card 
images.  The  BMDP  programs  are  also  distributed  as 
load and object modules for IBM 360/370 computers.

Since there are some differences in FORTRAN from
computer  to  computer,  we  have  established 
redistribution  centers  for  other  computer  systems. 

Requests  for  copies  of  the  IBM  version  of  the 
BMDP  programs  must  be  made  on  one  of  our  request 
forms.  For  our  Tape  Copy  Request  Brochure,  call  or 
write: 

BMDP  Program  Librarian 
Department  of  Biomathematics 
University  of  California 
Los  Angeles,  CA  90024 
(213)  825-5940 

Current  addresses  and  telephone  numbers  of  the 
redistribution  centers  for  other  computer  systems 
are  included  in  the  Tape  Copy  Request  Brochure.  Note 
that  versions  obtained  from  redistribution  centers 
may  not  be  as  recent  as  those  obtained  from  us.  As 
of  1981  there  are  redistribution  centers  for  the 
following  computer  systems: 

CDC 6000/CYBER

Honeywell 

Univac  1100  series 

Univac  70/90  series 

PDP-11 

HP-3000 

Burroughs 

Siemens 

Vax 

DEC  10/20 

Perkin-Elmer  (Interdata) 

ICL  System  4  and  2900  series 

Hitachi  Hitac  Series 

Fujitsu  Facom  series 

Xerox  Sigma  7 

Telefunken 

MODCOMP  Classic 

Prime 

We  encourage  users  of  other  computer  equipment  to 
become  redistribution  centers  and  so  avoid 
duplication  of  effort. 


BMDP  Manuals 

BMDP  Manuals  can  be  ordered  from 

University  of  California  Press 
2223  Fulton  Street 
Berkeley,   CA     94720 
(415)  642-4243 

or 

University  of  California  Press,  Ltd. 

Ely  House 

37  Dover  St. 

London,  W1X  4HQ 

England 

Telephone:  01-499-4688   and 

01-493-5061 
Telex:     24224  ref  3545 
Cables:    CCJ  London  Wl 

A  copy  of  the  manual  is  supplied  with  each  Tape  Copy 
from  us. 

The  following  publications  are  also  available: 

BMDP  User's  Digest  -  a  140-page  condensed  guide  to 
the  BMDP  programs.  Coat  pocket  size. 

BMDP  Pocket  Guide  -  a  concise  summary  of  the  BMDP 
Control  Language  in  a  convenient  shirt  pocket  size. 

BMDP  Reference  Card  -  a  quick  reference  list  of  BMDP 
Control  Language  instructions. 

These  three  publications  can  be  ordered  from 

BMDP  Statistical  Software 

P.O.  Box  24A26 

Los  Angeles,  CA  90024 

Difficulties  and  Suspected  Errors 

We  are  most  anxious  to  improve  the  quality  of  the 
BMDP  programs.  To  report  an  error,  please  send  a 
complete  listing  of  the  input  and  output  to 

BMDP  Statistical  Software 
Department  of  Biomathematics,  UCLA 
Los  Angeles,  CA  90024 




Contents

1.   Introduction
     1.1  A Guide to This Manual

2.   Data Analysis -- Using the BMDP Programs
     2.1  Basic Terminology for Common Features and Statistics
     2.2  Overview of Data Analysis with BMDP

3.   Using BMDP Programs
     3.1  Examples with Input and Output
     3.2  Research Forms, Coding Sheets and Keypunching Cards
     3.3  Specifying the Data Format


Features Common to All Programs

4.   Requirements for an Analysis -- BMDP System Instruction
     4.1  The BMDP Instruction Language
     4.2  System Cards
     4.3  Common Errors in Specifying BMDP Instructions

5.   Describing Data and Variables with BMDP Instructions
     5.1  Input and Output Examples Illustrating Instructions Common to
          All BMDP Programs
     5.2  The PROBLEM Paragraph
     5.3  Reading the Data: the INPUT Paragraph
     5.4  Describing the Variables: the VARIABLE Paragraph
     5.5  The GROUP or CATEGORY Paragraph
     5.6  Free Format Data Reading (new in 1981)
     5.7  Additional Options and Features for Formatted Data Reading
          (new in 1981)
     5.8  Multiple Problems and Subproblems

6.   Data Editing -- Variable Transformation and Case Selection
     6.1  Data Editing and Transformations -- The BMDP TRANSFORM Paragraph
     6.2  FORTRAN Transformations Using the BIMEDT Procedure
     6.3  P1S -- Multipass Transformations

7.   The BMDP File -- Saving Data and Statistics for Further Analysis
     7.1  Examples of Writing and Using a BMDP File
     7.2  What is Stored in a BMDP File
     7.3  Creating a BMDP File: the SAVE Paragraph
     7.4  Reading the BMDP File: the INPUT Paragraph
     7.5  Additional BMDP File Features

Types of Analyses -- Program Descriptions

8.   Data Description
     8.1  P1D -- Simple Data Description and Data Management
     8.2  P2D -- Detailed Data Description, Including Frequencies
     8.3  P4D -- Single Column Frequencies -- Numeric and Nonnumeric

9.   Data in Groups -- Description, t Tests, One-Way and Two-Way
     Analysis of Variance
     9.1  P3D -- Comparison of Two Groups with t Tests
     9.2  P7D -- Description of Groups (Strata) with Histograms and
          Analysis of Variance
     9.3  P9D -- Multiway Description of Groups

10.  Plots and Histograms
     10.1  P5D -- Histograms and Univariate Plots
     10.2  P6D -- Bivariate (Scatter) Plots

11.  Frequency Tables
     P4F -- Two-Way and Multiway Frequency Tables -- Measures of
     Association and the Log-Linear Model (Complete and Incomplete Tables)

12.  Missing Values -- Patterns, Estimation and Correlations
     12.1  P8D -- Correlations with Options for Incomplete Data
     12.2  PAM -- Description and Estimation of Missing Data

13.  Regression
     13.1  P1R -- Multiple Linear Regression
     13.2  P2R -- Stepwise Regression
     13.3  P9R -- All Possible Subsets Regression
     13.4  P4R -- Regression on Principal Components
     13.5  P5R -- Polynomial Regression

14.  Nonlinear Regression -- And Maximum Likelihood Estimation
     14.1  P3R -- Nonlinear Regression
     14.2  PAR -- Derivative-Free Nonlinear Regression
     14.3  Applications of Nonlinear Regression Algorithms Including
           Maximum Likelihood Estimation
     14.4  System Cards for P3R and PAR
     14.5  PLR -- Stepwise Logistic Regression

15.  Analysis of Variance and Covariance
     15.1  P1V -- One-Way Analysis of Variance and Covariance
     15.2  P2V -- Analysis of Variance and Covariance, Including
           Repeated Measures
     15.3  P4V -- General Univariate and Multivariate Analysis of
           Variance and Covariance, Including Repeated Measures
     15.4  P3V -- General Mixed Model Analysis of Variance
     15.5  P8V -- General Mixed Model Analysis of Variance -- Equal
           Cell Sizes

16.  Nonparametric Analysis
     P3S -- Nonparametric Statistics

17.  Cluster Analysis
     17.1  P1M -- Cluster Analysis of Variables
     17.2  P2M -- Cluster Analysis of Cases
     17.3  PKM -- K-Means Clustering of Cases
     17.4  P3M -- Block Clustering

18.  Multivariate Analysis
     18.1  P4M -- Factor Analysis
     18.2  P6M -- Canonical Correlation Analysis
     18.3  P6R -- Partial Correlation and Multivariate Regression
     18.4  P7M -- Stepwise Discriminant Analysis
     18.5  P8M -- Boolean Factor Analysis
     18.6  P9M -- Linear Scores for Preference Pairs

19.  Survival Analysis
     19.1  P1L -- Life Tables and Survival Functions
     19.2  P2L -- Survival Analysis with Covariates -- Cox Models

20.  Time Series
     20.1  P1T -- Univariate and Bivariate Spectral Analysis
     20.2  P2T -- Box-Jenkins Time Series Analysis

Appendices

A.   Computational Procedures
     A.1   Random Numbers
     A.2   Method of Provisional Means
     A.3   Hotelling's T² and Mahalanobis D² (P3D)
     A.4   Bartlett's Statistic (P9D)
     A.5   Tests and Measures in the Two-Way Frequency Table (P4F)
     A.6   The Log-Linear Model (P4F)
     A.7   Parameter Estimates and Standard Errors for Log-Linear
           Models (P4F)
     A.8   Instructions for P1F, P2F and P3F
     A.9   Estimating (Smoothing) the Missing Value Correlation
           Matrix (PAM)
     A.10  Replacing Missing Values (PAM)
     A.11  Linear Regression -- Estimating the Coefficients (P1R or P2R)
     A.12  Residual Analysis in P9R
     A.13  Regression on Principal Components (P4R)
     A.14  Polynomial Regression (P5R)
     A.15  Nonlinear Least Squares (P3R)
     A.16  Derivative-Free Nonlinear Regression (PAR)
     A.17  One-Way Analysis of Variance and Covariance (P1V)
     A.18  Analysis of Variance and Covariance, Including Repeated
           Measures (P2V)
     A.19  General Mixed Model Analysis of Variance (P3V)
     A.20  General Univariate and Multivariate Analysis of Variance
           and Covariance (P4V)
     A.21  Maximum Likelihood Factor Analysis (P4M)
     A.22  Canonical Analysis (P6M)
     A.23  Stepwise Discriminant Analysis (P7M)
     A.24  Using BMDP Programs from a Terminal
     A.25  Missing Value Covariance Matrices (P8D)
     A.26  Classification Functions (P7M)
     A.27  General Mixed Model ANOVA -- Empty Cells (P3V)
     A.28  Logistic Regression (PLR)
     A.29  K-Means Clustering (PKM)
     A.30  Survival Functions (P1L)
     A.31  Regression with Incomplete Survival Data (P2L)
     A.32  Univariate and Bivariate Spectral Analysis (P1T)
     A.33  Box-Jenkins Time Series Analysis (P2T)
     A.34  Boolean Factor Analysis (P8M)
     A.35  Linear Scores for Preference Pairs (P9M)
     A.36  Robust Estimators (P2D)
     A.37  Cluster Analysis of Cases (P2M)

B.   Size of Programs
     B.1   Increasing the Capacity of BMDP Programs
     B.2   Program Limitations

C.   Selected Articles from BMDP Communications
     C.1   Detecting Outliers with Stepwise Regression (P2R)
     C.2   It Wasn't an Accident (F-to-enter, F-to-remove)
     C.3   Scaling for Minimum Interaction Using BMDP6M
     C.4   Computing Predictions
     C.5   Tolerance in Regression Analysis
     C.6   Random Case Selection
     C.7   Ridge Regression Using BMDP2R
     C.8   Analysis of Multivariate Change Scores
     C.9   Checking Order of Cards in a Data Deck
     C.10  Quick and Dirty Monte Carlo
     C.11  First Steps
     C.12  Using P1D to Identify and List Cases Containing Special or
           Unacceptable Values
     C.13  The Iterated Least Squares Method of Estimating Mean and
           Variance Components Using P3R
     C.14  Lagged Variables Using Transformation Paragraph
     C.15  Cross Validation in BMDP9R
     C.16  Partial and Canonical Correlation

D.   How To:
     Request Copies of the BMD and BMDP Programs
     Obtain Additional Manuals
     Report Difficulties and Suspected Errors
     Arrange a Visit by a BMDP Statistician
     Obtain BMDP COMMUNICATIONS
     Obtain BMDP Technical Reports

E.   IBM OS System Cards Required to Use BMDP Programs

REFERENCES
INDEX


Tables

Table   Title

2.1     Features common to BMDP programs
5.1     Werner et al (1970) blood chemistry data
6.1     BMDP transformations available in the TRANSFORM paragraph
8.1     Nineteen cases of data containing nonnumeric characters
11.1    Data on 117 male coronary patients
11.2    Tests and measures computed by P4F
11.3    Incidence of peptic ulcer by blood group (Woolf)
11.4    Three-year survival of breast cancer patients
14.1    Radioactivity in the blood of a baboon named Brunhilda
14.2    Radioactivity corresponding to levels of insulin standard
14.3    Data on sensitivity of Chorda tympani fibers in rat's tongue
        to four test stimulus sources
15.1    Data from Afifi and Azen (1972, p. 166) omitting the same
        cases omitted by Kutner (1974)
15.2    Numerical example from Winer (1971, p. 525)
15.3    Numerical example from Winer (1971, p. 564)
15.4    Numerical example from Winer (1971, p. 546)
15.5    Numerical example from Winer (1971, p. 806)
15.6    Numerical example from Winer (1971, p. 803)
15.7    Life times of electronic components
15.8    Random effects model analysis of variance
15.9    Maximum likelihood analysis for the random effects model
15.10   Maximum likelihood analysis for the mixed model
15.11   Mixed model analysis of variance with no interaction
15.12   Maximum likelihood analysis for the mixed model with no
        interaction
16.1    Data from Siegel (1956, p. 253)
17.1    Data from Jarvik's smoking questionnaire, administered to 110
        subjects
17.2    Health indicators
18.1    Fisher iris data
18.2    Data from a study of lymphocyte blood cells
19.1    Survival of lung cancer patients
19.2    Artificial dates giving rise to the survival times in Table 19.1




2.2  Data  Analysis 


Appendix C.11). Specific program options are
presented  in  the  program  descriptions.  Some  of  the 
more  advanced  techniques,  such  as  maximum  likelihood 
estimation,  are  discussed  only  in  the  program 
descriptions. 


Data  Screening  and  Description 

The  first  step  in  an  analysis  is  to  examine  the 
data  for  errors  and  for  the  appropriateness  of 
assumptions  to  be  used  in  the  analysis  (such  as 
normality). If errors remain in the data, they can
cause  a  "garbage-in,  garbage-out"  analysis.  Blunders 
or  extreme  outliers  in  the  data  may  need  to  be 
removed  to  achieve  a  meaningful  analysis.  The  data 
may  need  to  be  transformed  to  fit  the  various 
assumptions  (constant  variance,  normality,  etc.) 
required  by  the  statistical  model. 

After  the  original  data  have  been  recorded, 
various  descriptive  characteristics  of  the  data  can 
be  used  to  detect  gross  errors  in  the  observations, 
in  coding  the  data,  in  including  inappropriate 
cases,  etc.  A  good  place  to  begin  screening  is  to 
check  for 

-  symbols  or  characters,  such  as  letters  where 
numbers  should  be  (P4D  counts  all  distinct 
characters  for  each  column  of  data,  one  column  at  a 
time);  many  programs  will  not  run  if  nonnumeric 
symbols  are  in  the  data  used  for  analysis  -  error 
messages  are  reported,  however,  by  all  programs  when 
illegal  characters  are  found. 

-  outliers  or  blunders  (P2D  can  be  used  to  obtain 
a  small  histogram  and  frequency  counts  for  all 
distinct  values  of  each  variable) 

Listing  the  cases  by  one  of  the  methods  described  on 
p.  5  may  also  locate  problems  in  the  data. 
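
The first screening check can be illustrated with a short Python sketch that tallies the distinct characters in each column of the raw records, one column at a time, and reports columns holding anything other than digits, signs, decimal points, or blanks. It imitates the idea described for P4D and is not that program; the function names, the set of accepted characters, and the sample records are assumptions.

```python
from collections import Counter

# Characters accepted as part of a numeric field (an assumption for this sketch).
NUMERIC_CHARS = set("0123456789+-. ")


def column_character_counts(records):
    """Return one Counter per column with the frequency of each character."""
    width = max((len(rec) for rec in records), default=0)
    counts = [Counter() for _ in range(width)]
    for rec in records:
        for col, ch in enumerate(rec.ljust(width)):
            counts[col][ch] += 1
    return counts


def suspicious_columns(records):
    """Return 1-based numbers of columns containing nonnumeric characters."""
    return [col + 1
            for col, counter in enumerate(column_character_counts(records))
            if any(ch not in NUMERIC_CHARS for ch in counter)]


# Hypothetical records; the second has the letter O typed in place of a zero.
records = [" 12.5  7", " 13.O  8", " 11.9  9"]
flagged = suspicious_columns(records)
```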

Outliers  can  be  identified  by  multivariate 
screening.  For  each  case  P4M  prints  the  Mahalanobis 
distance  squared  from  the  case  to  the  center  of  all 
cases.  Within  group  multivariate  outliers  can  be 
identified  by  using  P7M,  which  prints  the 
Mahalanobis  distance  squared  from  the  case  to  the 
center  of  each  group.  P9R  also  prints  distance 
measures  helpful  for  identifying  unusual  cases. 
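
As an illustration of the multivariate screening described above, the following sketch computes the squared Mahalanobis distance from each case to the center of all cases for the two-variable situation, using only the standard library. It is not the BMDP algorithm; the function name and the data are hypothetical, and the sample covariance matrix (divisor n - 1) is an assumption.

```python
def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) case from the centroid."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # d' S^-1 d, with the 2 x 2 inverse written out explicitly
        distances.append((syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


# Hypothetical cases; the last one lies far from the others.
cases = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (10.0, 1.0)]
d_squared = mahalanobis_sq_2d(cases)
most_unusual = d_squared.index(max(d_squared))
```

Cases with the largest squared distances are the candidates to inspect by hand before any are removed.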

Univariate  descriptive  statistics  are  found  in 
most  programs,  but  especially  in  P2D  and  P1D.  For 
example,  from  the  cumulative  percentiles  printed  in 
P2D  for  each  distinct  value,  you  can  make  summary 
statements  such  as,  "sixty  percent  of  the  patients 
are  in  the  50-60  age  group,  while  only  seven  percent 
are in their twenties", etc. A stem and leaf
histogram  is  also  available  in  this  program. 


Data  in  Groups 

In  screening,  you  often  need  to  examine  groups 
(strata  or  subpopulations)  of  the  data.  Unusual  data 
values  that  are  masked  in  a  total  population  may 
stand  out  when  the  data  are  separated  into  groups  or 
strata.  Some  variables  are  easily  coded  into  groups, 
such as sex (males=1, females=2). Continuous
variables  can  be  categorized  by  a  grouping  variable. 

P7D  is  especially  powerful  for  examining  groups; 
it  prints  histograms  (side-by-side  for  each  group) 
and  statistics  for  each  group;  it  also  provides  a 
choice  of  one-way  or  two-way  analysis  of  variance  to 
check  group  differences.  From  this  output  you  can 
identify extreme outliers, obtain an idea of the
distribution of data within groups, and examine
whether  the  assumption  of  normality  is  reasonable. 
Heteroscedasticity  (lack  of  constant  variance  over 
groups)  can  also  be  observed  and  tested,  and  may 
indicate  that  the  input  data  should  be  transformed. 

An  analysis  of  variance  using  P7D  can  indicate 
whether  group  differences  are  large  enough  to 
suggest  that  future  analyses  should  be  stratified. 
P7D  also  computes  ANOVA  tests  that  do  not  assume 
equal  group  variances,  plots  cell  standard 
deviations  vs.  cell  means  and  reports  Bonferroni 
probabilities for pairwise tests of cell means. More
information on group differences (both univariate
and multivariate) can be obtained by using P3D. It
yields t statistics, Hotelling's $T^2$ and Mahalanobis
$D^2$ for each pair of groups; t statistics, based on
both pooled and separate variance estimates, are
printed in the output. A trimmed t test is available
in  the  1979  version.  The  Levene  test  for  equality  of 
variances  is  in  both  P3D  and  P7D. 
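
To see how the pooled and separate variance t statistics differ, the following Python sketch computes both for two groups. It is an illustration under the usual textbook definitions, not P3D output; the function name `two_sample_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(a, b):
    """Return (pooled-variance t, separate-variance t) for two groups."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)  # sample variances (divisor n - 1)
    diff = mean(a) - mean(b)
    # Pooled estimate assumes both groups share one variance.
    pooled_var = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t_pooled = diff / sqrt(pooled_var * (1 / na + 1 / nb))
    # Separate estimate lets each group keep its own variance.
    t_separate = diff / sqrt(va / na + vb / nb)
    return t_pooled, t_separate
```

With equal group sizes the two statistics coincide; they diverge when the sizes and the variances both differ, which is when the choice between them matters.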

When  the  cases  are  classified  by  more  than  one 
grouping  variable  or  factor,  P9D  (Multiway 
Description  of  Groups)  can  be  used  to  compute  cell 
frequencies,  means  and  standard  deviations.  Grouping 
variables  can  be  suppressed  to  obtain  information 
about  marginal  cells.  The  program  tests  for  the 
equality  of  cell  frequencies  and  cell  means  and  for 
homogeneity  of  cell  variances.  These  tests  are 
performed  on  all  cells  or  on  specified  marginals. 
Cell  means  are  plotted  four  variables  per  page  in  a 
compact  graphical  display  scaled  by  the  overall  mean 
and  standard  deviation.  This  display  is  helpful  for 
understanding  interactions  in  more  complex  ANOVA 
designs. 


Transformations 

After  screening  and  describing  your  data,  you 
should  be  ready  to  make  decisions  regarding 
transformations.  The  transformed  data  can  be  put 
directly  on  a  BMDP  File  ready  for  easy  input  into 
any  other  BMDP  program.  Although  all  programs  can 
perform  data  transformations,  you  may  need  to  use 
P1S, the multipass transformation program, for
getting the data transformed and ready for further
analyses. P1S can be used when your transformation
requires more than one pass through the data.


Plots  and  Histograms 

Many  research  workers  like  to  see  their  data  in 
graphical  form;  scatter  plots,  for  example,  are  a 
good  way  to  present  information  concisely  and 
clearly in final reports. Scatter plots that take
advantage of known information can be designed to
display unusual cases or outliers, for example, to
show whether or not an individual's systolic blood
pressure level is higher than his diastolic level. A
scatter plot of these two variables will show if the
data coding is mistakenly reversed for some cases.
Or in a plot of height versus weight, a case that
has a height of 72 inches and a weight of 225 lbs.
will clearly stand out if the height is mispunched
as 52 inches.

A grouping variable can be used in the P6D
scatter plot program to provide information about a
variable not used as one of the plot axes. If age, for
example, is divided into groups (less than or
equal to 15, 16-35, 36-55 and over 55), the letters
A, B, C and D are used to represent cases from each
age group. When two other measurements for the




Data  Analysis  2.2 


subjects  are  plotted,  the  children  may  appear  in  a 
separate  area  of  the  plot,  indicating  that  they 
should  be  analyzed  separately  in  later  analyses.  P6D 
can  also  perform  a  simple  regression  analysis  for 
the  data  in  a  scatter  plot.  This  analysis  may 
indicate  whether  or  not  an  analysis  of  covariance 
should  be  used  later.  Variables  can  be  plotted 
against  time  of  entry  into  a  study  to  see  if 
observations  are  independent,  or  if  a  drift  over 
time  is  occurring. 

P5D  can  print  a  histogram  for  all  the  data  or  for 
one  or  more  groups,  each  identified  by  a  different 
letter.  You  can  specify  the  scales  of  the  histogram 
to  produce  a  histogram  suitable  for  a  final  report. 

Normality  can  be  roughly  checked  by  looking  at 
histograms  in  P7D  or  P5D.  P5D  can  also  print  a 
normal  probability  plot  that  provides  a  better 
assessment  of  normality  and  helps  to  identify 
outliers.


Frequency  Tables 

Cross tabulations are frequently used as a form
of  final  reporting  to  give  a  picture  of  the  number 
of  cases  in  specified  categories  (or 
cross-classifications).  Tables  can  be  formed  from 
data  or  from  cell  frequencies.  Tables  can  also  be 
formed  for  each  level  of  a  third  variable  (such  as 
separately  for  males  and  females).  Twenty-three 
statistics  appropriate  for  the  analysis  of 
contingency  tables  are  available  in  P4F  (which 
includes  all  the  features  formerly  contained  in 
programs  P1F,    P2F  and   P3F). 

P4F  can  test  whether  rows  are  independent  of 
columns  using  the  frequencies  in  all  cells.  P4F  can 
also  test  the  same  hypothesis  using  any  subset  of 
the cells; for example, are rows independent of the
columns for all cells, excluding the cells on the
diagonal? P4F can also identify cells that
contribute heavily to a significant chi-square test
of independence.

Multiway frequency tables are formed and analyzed
by  P4F.  A  log-linear  model  can  be  fitted  to  the  cell 
frequencies  and  the  fit  tested.  P4F  can  be  used  to 
select  an  appropriate  model  for  the  data  and  to 
estimate  the  parameters  of  the  model. 
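
The test of independence for a two-way table, and the idea of cells that contribute heavily to it, can be illustrated with a short Python sketch. This is the ordinary Pearson chi-square computation, offered as an assumption about what is meant rather than as P4F's own algorithm; the function name is hypothetical.

```python
def chi_square_contributions(table):
    """Pearson chi-square statistic and per-cell contributions for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    contributions = []
    for i, row in enumerate(table):
        contribution_row = []
        for j, observed in enumerate(row):
            # Expected count if rows were independent of columns.
            expected = row_totals[i] * col_totals[j] / n
            contribution_row.append((observed - expected) ** 2 / expected)
        contributions.append(contribution_row)
    chi2 = sum(sum(r) for r in contributions)
    return chi2, contributions
```

The statistic is referred to a chi-square distribution with (rows - 1)(columns - 1) degrees of freedom; the largest entries of `contributions` point to the cells that drive a significant result.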

Missing  Values 

All  too  often  the  data  recorded  are  not  complete 
and  some  values  are  missing.  These  missing  values 
are  usually  left  blank  or  coded  by  a  special  code 
called the "missing value code". Missing values, and
unusually extreme values that appear to be wrong,
are  excluded  from  an  analysis. 

PAM  lists  cases  containing  missing  values  or  data 
to be excluded from the analysis, computes the
percentage  of  missing  data  for  each  variable,  and 
reports  special  patterns  in  the  data.  PAM  can  also 
estimate  values  to  replace  the  missing  value  code 
(or  excluded  values)  based  upon  the  data  present  in 
the  case. 

Most  regression  and  multivariate  analyses  require 
complete  cases;  i.e.,  no  missing  or  excluded  values 
in  any  case.  Many  of  these  analyses  can  begin  from  a 
correlation  or  covariance  matrix.  Both  PAM  and  P8D 
can  estimate  correlations  using  cases  with  some  data 
missing;  the  correlation  matrix  can  then  be  stored 
in a BMDP File and used as input to other programs,
including those that require complete data. PAM
ensures that the resulting correlation matrix is
numerically appropriate (positive semidefinite) for
a  regression  or  factor  analysis;  P8D  allows  you  to 
choose  between  four  methods  to  compute  the 
correlations. 
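
One common way to estimate a correlation from incomplete data is to use every case in which both variables are present. The Python sketch below shows that single approach; it is not claimed to match any of the four P8D methods or the PAM estimates, and the function name and the use of `None` as the missing value code are assumptions.

```python
from math import sqrt

def pairwise_correlation(x, y, missing=None):
    """Pearson r using only the cases where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y)
             if a is not missing and b is not missing]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / sqrt(sxx * syy)
```

Because each pair of variables may use a different set of cases, a matrix assembled this way is not guaranteed to be positive semidefinite, which is the property the text says PAM protects.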


Regression 

A regression analysis studies the relationship
between a dependent variable, y, and one or more
independent variables, $x_i$. The linear least squares
model with parameters or regression coefficients,
$\beta_i$, can be written

$$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p + e$$

For simple linear regression ($x_1$ is the only
independent variable in the model), P6D, P1R and P2R
can be used. If there are several independent
variables, P1R, P2R or P9R can be used to perform
multiple linear regression analyses.
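
For the simple linear regression case, the least squares estimates have a closed form. The following Python sketch shows it; the function name is hypothetical and the code illustrates the model, not the output of P6D, P1R or P2R.

```python
def simple_linear_regression(x, y):
    """Least squares estimates (b0, b1) for y = b0 + b1*x + e, plus residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx            # slope
    b0 = my - b1 * mx         # intercept
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    return b0, b1, residuals
```

Plotting the returned residuals against x or against the predicted values is the kind of check the residual analyses described below are meant to support.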

P1R, P2R and P9R differ in three important
respects: 

-  the     criterion     for     including     independent 
variables    in   the   multiple   linear   regression 

-  the    ability    to    repeat    the    analysis    on    subgroups 
of   the   cases   and    to   compare   the   subgroups 

-  the    residual   analysis   available 

P1R  includes  all  the  specified  independent 
variables  in  the  multiple  regression  equation.  It 
computes  a  multiple  linear  regression  on  all  the 
data and on groups or subpopulations. If grouping is
requested,  P1R  first  analyzes  all  cases  combined  and 
then  analyzes  each  group  separately.  After  all 
groups  have  been  analyzed,  the  regression  equations 
are    tested   for   equality  between  groups. 

P2R  computes  the  multiple  linear  regression  in  a 
stepwise manner. At each step it enters into the
regression equation the variable that best helps to
predict y or removes the least helpful variable.
Several criteria are available for entering or
removing variables from the equation (see P2R
program description). A stepwise procedure is useful
for identifying a good set of predictor variables
(separating the most important variables from those
that may not be necessary at all), and when
sufficient preliminary information regarding the
importance of the independent variables is not
available. In practical applications the stepwise
procedure is often a satisfactory solution.

P9R identifies "best" subsets of independent
variables in terms of a criterion such as $R^2$,
adjusted $R^2$ or Mallows' $C_p$ (described in P9R program
description). It also identifies alternative good
subsets of the independent variables. P9R computes
only a small fraction of all possible regressions to
find the numerically best subset.

All three programs print and plot residuals and
predicted values. The plots are useful in detecting
lack of linearity, heteroscedasticity (lack of
constant variance), unusual outliers, gross errors,
an unusual subpopulation that should be separated
from  the  analysis,  etc.  The  plots  may  also  indicate 
that  transformations  of  the  data  are  necessary  or 
that   an   inappropriate   model  was   chosen. 

The  residual  analysis  in  P9R  is  the  most 
extensive  of  the  three.  P9R  also  allows  easy 
cross-validation of the regression model by testing
it on a subset of the cases excluded from the
analysis.

P4R  creates  new  independent  variables,  called 
principal  components,  that  are  linear  combinations 
of the original independent variables. These
principal components are determined in a way that
provides a parsimonious summary of the original
variables; a subset of the principal components
explains most of the total variance of the original
set  of  independent  variables.  The  program  then 
regresses  the  dependent  variable  in  a  stepwise 
manner  on  the  principal  components;  not  all  the 
principal  components  may  be  used,  but  useful 
information  is  based  on  all  the  variables.  The 
regression  equations  at  each  step  are  expressed  in 
terms  of  the  principal  components  and  the  original 
variables. 

The  relation  between  an  independent  and  a 
dependent  variable  may  require  terms  with  higher 
powers.  The  model  for  polynomial  regression  in  P5R 
is 

$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k + e$$

P5R  reports  polynomials  of  degree  one  through  a 
specified  degree;  this  helps  to  determine  the 
highest-order  equation  necessary  for  an  adequate  fit 
of  the  data.  As  higher-order  terms  are  introduced 
into  the  model,  the  fitted  regression  curve  and  the 
original  data  can  be  plotted  at  each  step  for  a 
visual  check  on  how  the  fit  is  proceeding. 
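
The effect of raising the degree can be seen by fitting successive polynomials and comparing their residual sums of squares. The Python sketch below solves the normal equations directly, which is adequate for small, well-scaled examples; it is not P5R's numerical method, and both function names are hypothetical.

```python
def fit_polynomial(x, y, degree):
    """Least squares coefficients [b0, b1, ..., bk] via the normal equations."""
    k = degree + 1
    # Normal equations A b = c, with A[r][s] = sum x^(r+s), c[r] = sum y*x^r.
    a = [[sum(xi ** (r + s) for xi in x) for s in range(k)] for r in range(k)]
    c = [sum(yi * xi ** r for xi, yi in zip(x, y)) for r in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        c[col], c[pivot] = c[pivot], c[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for s in range(col, k):
                a[r][s] -= factor * a[col][s]
            c[r] -= factor * c[col]
    # Back substitution.
    b = [0.0] * k
    for r in range(k - 1, -1, -1):
        tail = sum(a[r][s] * b[s] for s in range(r + 1, k))
        b[r] = (c[r] - tail) / a[r][r]
    return b

def residual_ss(x, y, b):
    """Residual sum of squares of the fitted polynomial."""
    return sum((yi - sum(bj * xi ** j for j, bj in enumerate(b))) ** 2
               for xi, yi in zip(x, y))
```

Calling `fit_polynomial` for degree 1, 2, 3 and so on, and watching where `residual_ss` stops dropping appreciably, mirrors the degree-by-degree comparison described above.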


Nonlinear  Regression 

To  fit  a  model  where  the  equation  is  not  linear 
in  the  parameters  you  can  use  the  nonlinear 
regression  programs,  P3R  and  PAR.  These  are  least 
squares  programs  appropriate  for  a  wide  variety  of 
problems  that  are  not  well-represented  by  equations 
with  linear  parameters.  Several  different  functions 
are  available  in  P3R  by  simply  stating  a  number, 
including such functions as sums of exponentials

$$p_1 e^{p_2 t} + p_3 e^{p_4 t} + \cdots$$

ratios  of  polynomials,  a  combination  of  sine  and 
exponential  functions,  etc.  If  you  want  a  function 
different  from  those  described  in  the  P3R  program 
description,  you  can  request  it  by  FORTRAN 
statements  in  P3R  or  PAR.  In  P3R  you  must  also 
specify  the  function's  partial  derivatives. 

A  special  nonlinear  model  is  the  logistic 
function.  PLR  computes  the  maximum  likelihood 
estimates of the parameters of

$$E(s) = \frac{e^{\beta x}}{1 + e^{\beta x}}$$

where s is the sum of the binary (0,1) dependent
variable y ($s = \sum y$) and x represents the
independent  variables.  The  dependent  (outcome) 
variable  records  events  such  as  success  or  failure, 
response  or  no  response,  etc.  The  independent 
(explanatory  or  covariate)  variables  can  be 
categorical  (e.g.,  sex,  treatment,  hospital)  and 
continuous  (e.g.,  age,  height,  blood  pressure).  The 
program  generates  design  variables  for  the 
categorical   variables   and    their   interactions. 
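
To make the logistic function concrete, the Python sketch below evaluates it for a given coefficient vector and computes the log-likelihood that a maximum likelihood program would maximize. It does not perform the maximization, and it is not PLR's implementation; the function names and the representation of cases as (x, y) pairs are assumptions.

```python
from math import exp, log

def logistic_probability(beta, x):
    """e^(beta.x) / (1 + e^(beta.x)) for coefficient and covariate lists."""
    u = sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + exp(-u))  # algebraically equal to e^u / (1 + e^u)

def log_likelihood(beta, cases):
    """Log-likelihood of binary outcomes; cases is a list of (x, y), y in {0, 1}."""
    total = 0.0
    for x, y in cases:
        p = logistic_probability(beta, x)
        total += log(p) if y == 1 else log(1.0 - p)
    return total
```

Including a constant 1 as the first element of each x gives the model an intercept term.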


Analysis  of  Variance  and  Covariance 

Analysis  of  variance  is  used  to  test  for 
differences  between  the  means  of  two  or  more  groups 
or  subpopulations.  In  a  simple  one-way  analysis  of 
variance  each  individual  (or  subject)  is  classified 
into one category or group; for example, in a
medical  problem  patients  could  be  assigned  to 
treatment  A,  B  or  C.  The  patients  are  grouped  by  the 
type  of  treatment.  The  model  for  this  one-way  design 
is 


$$Y_{ik} = \mu + \alpha_i + e_{ik}$$

where $\alpha_1$, $\alpha_2$ and $\alpha_3$ might represent the effect of
treatments A, B and C, respectively, on the
dependent variable, $Y_{ik}$, a blood pressure reading
for case k in group i. Programs P7D, P9D, P1V and
P2V can be used to test the hypothesis

$$H_0: \text{all } \alpha_i = 0$$

that there is no difference between treatments.
Group sizes may be unequal in all four of these
programs. For each dependent variable analyzed, P7D
presents side-by-side histograms that give an
excellent  visual   picture   of  how  the   groups   differ. 
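
The F test for the one-way design compares the variation between group means with the variation within groups. The Python sketch below computes the statistic for groups of possibly unequal size; it illustrates the standard calculation rather than the output of any of the four programs, and the function name is hypothetical.

```python
def one_way_anova_f(groups):
    """F statistic and degrees of freedom for a one-way layout."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares: cases around their own group mean.
    ss_within = sum(sum((v - m) ** 2 for v in g) for g, m in zip(groups, means))
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within
```

A large F relative to the F distribution with the returned degrees of freedom is evidence against the hypothesis that all treatment effects are zero.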

In  the  medical  treatment  example  above,  if  the 
covariate  x  (age)  also  affects  the  dependent 
variable    (blood   pressure),    the   one-way  model  becomes 


$$Y_{ik} = \mu + \alpha_i + \beta(x_{ik} - \bar{x}) + e_{ik}$$


P1V  could  be  used  to  examine  treatment  effects  after 
adjusting  for  the  linear  effect  of  age.  P1V  also 
allows  multiple  covariates.  It  prints  an  analysis  of 
variance  table  with  F  tests  for  equality  of  slopes, 
zero  slopes  and  equality  of  adjusted  group  means 
(which  adjusts  for  the  effect  of  the  covariate)  and 
a   number   of   residual   plots. 

Several  factors  (or  characteristics)  may  be 
involved  in  an  analysis  of  variance  model.  In  a 
two-way  factorial  analysis  of  variance,  the 
individuals  in  each  group  are  classified  by  two 
characteristics,  such  as  sex  and  treatment.  The 
model   can  be  written 


$$Y_{ijk} = \mu + \alpha_i + \eta_j + (\alpha\eta)_{ij} + e_{ijk}$$

Here the $\alpha_i$'s could be treatment effect, the $\eta_j$'s
sex effect and $(\alpha\eta)_{ij}$ a possible interaction between
sex and treatment. P7D can be used to analyze these
data.  The  accompanying  histograms  give  additional 
information. 

P2V  handles  general  fixed  effects  analysis  of 
variance  and  covariance  models.  This  program  can 
analyze  repeated  responses,  such  as  the  measurements 
of  a  subject's  blood  pressure  every  day  for  a  week. 
The  repeated  responses  are  called  trial  factors  or 
repeated  measures  factors  and  need  not  be 
statistically  independent.  In  the  blood  pressure 
example  above,  time  could  be  a  seven-level  trial 
factor  (e.g.,  a  subject's  blood  pressure  could  be 
recorded  every  day  of  the  week).  In  P2V  the  usual 
analysis  of  variance  factors,  such  as  sex  and 
treatment,  are  called  grouping  factors  to 
distinguish  them  from  trial  factors.  The  models  may 
have  only  trial  factors,  only  grouping  factors,  or 
both.  The  groups  can  contain  an  unequal  number  of 
subjects,  but  data  for  each  subject  must  include  all 
observations  over  the  trial  factor  (a  blood  pressure 
reading  must  be  given  for  each  day). 




Mixed  models  are  created  by  P8V  (which  requires 
equal  cell  sizes)  and  by  P3V  (which  allows  unequal 
cell  sizes  and  covariates).  P4V  is  a  very  general 
program  that  handles  multivariate  models,  including 
those  with  repeated  measures  and  covariates.  The 
user  may  specify  cell  weights  for  use  in  the 
definition  of  model  components  such  as  main  effects 
and  lower  order  interactions  in  factorial  models  or 
specify  unequal  intervals  for  orthogonal  polynomials 
in  response  surface  analysis. 

Nonparametric  Statistics 

If  your  data  grossly  violate  the  usual  analysis 
of  variance  normality  assumptions,  you  could  try  two 
nonparametric  tests  in  P3S  —  the  Kruskal-Wallis 
one-way  analysis  of  variance  test,  or  the  Friedman 
two-way  analysis  of  variance  test.  Nonparametric 
tests such as the Mann-Whitney U test, the sign test
and  the  Wilcoxon  signed  rank  test  can  also  be 
computed  with  P3S.  These  tests  can  be  used  when  the 
researcher  wants  to  avoid  a  t  test  assumption  of 
normality. 
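
As an example of how a rank-based test avoids the normality assumption, the Mann-Whitney U statistic depends only on the ordering of the observations. The Python sketch below counts, for each pair of cases from the two samples, which value is larger; it is an illustration, not the P3S computation, and the function name is hypothetical.

```python
def mann_whitney_u(a, b):
    """Return (U for sample a, U for sample b); tied pairs count one half each."""
    u_a = 0.0
    for x in a:
        for y in b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    # The two U statistics always sum to the number of pairs.
    return u_a, len(a) * len(b) - u_a
```

The smaller of the two values is usually the one compared with tabled critical values.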


Cluster  Analysis 

Although  many  research  studies  involve 
multivariate  observations  (many  variables  observed 
for  each  case),  sometimes  little  is  known  about  the 
inter-relations  between  variables,  between  cases,  or 
between  variables  and  cases.  In  discussing  screening 
and  data  description,  we  emphasized  that  groups  or 
subpopulations  should  be  examined;  however,  problems 
often  arise  when  groups  are  not  clearly  defined  or 
when  it  is  difficult  to  see  if  the  data  are 
structured.  Clustering  is  a  good  technique  to  use  in 
exploratory  or  early  data  analysis  when  you  suspect 
that  the  data  may  not  be  homogeneous  and  you  want  to 
classify  or  reduce  the  data  into  groups.  Clustering 
performs  a  display  function  for  multivariate  data 
similar  to  graphs  or  histograms  for  univariate  data; 
it  provides  a  multivariate  summary  —  a  description 
of  characteristics  of  clusters  instead  of  individual 
cases. 

Three  different  types  of  clustering  can  be 
performed  by  BMDP  programs:  clusters  of  variables 
(P1M), clusters of cases (P2M and PKM), and clusters
of both cases and variables (P3M). After deciding
which program is applicable to your problem, other
questions must be answered: How will you measure
distances between objects (variables in P1M, cases
in P2M and PKM)? How will you use the distances to
amalgamate  or  group  the  objects  into  clusters?  How 
will  you  display  the  resulting  clusters?  The  best 
answers  to  these  questions  are  still  being 
developed;  investigators  have  their  own  preferences 
as  to  which  distance  measure  or  which  amalgamation 
procedure  is  best.  You  may  want  to  try  several 
options given in the program descriptions to see
which  one  provides  the  best  results  for  your 
problem. 

In  both  P1M  and  P2M  the  clustering  begins  by 
finding  the  closest  pair  of  objects  (in  P1M, 
columns,  or  variables;  in  P2M,  rows,  or  cases) 
according  to  the  distance  matrix  and  combining  them 
to  form  a  cluster.  The  algorithm  continues,  joining 
pairs  of  objects,  pairs  of  clusters,  or  an  object 
with  a  cluster,  until  all  the  data  are  in  one 
cluster.  These  clustering  steps  are  shown  in  the 
output cluster diagram, or tree. The correlation or
distance matrix can also be printed in shaded form
to  display  pictorially  the  clusters. 
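
The joining process can be sketched in a few lines of Python. The example below clusters one-dimensional values and, as an assumption, measures the distance between two clusters by their closest members (single linkage); P1M and P2M offer their own distance and amalgamation options, which this sketch does not reproduce. The function name is hypothetical.

```python
def single_linkage_merges(points):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    clusters = [[i] for i in range(len(points))]  # start with one case per cluster

    def dist(ca, cb):
        # Distance between clusters = distance between their closest members.
        return min(abs(points[i] - points[j]) for i in ca for j in cb)

    history = []
    while len(clusters) > 1:
        pairs = [(dist(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        d, a, b = min(pairs)  # closest pair of clusters
        history.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return history
```

The returned history, read from first merge to last, is the information a cluster tree displays: which objects joined, and at what distance.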

The  clustering  method  in  P2M  is  hierarchical.  The 
procedure  in  PKM  is  called  k-means  and  begins  with 
user-specified  clusters  or  with  all  the  data  in  one 
cluster:  at  each  step  one  cluster  is  split  into  two. 
This  procedure  is  useful  when  you  have  a  large 
number  of  cases  or  when  your  goal  is  to  divide  the 
cases  into  homogeneous  subsets.  PKM  provides  several 
ways  to  standardize  the  data  in  order  to  avoid 
problems  caused  by  scale  differences. 
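
A minimal Python sketch of the k-means idea follows, for one-dimensional data and user-specified starting centers. It uses the common reassign-and-recompute iteration; it does not implement the cluster-splitting steps described for PKM, and the function name is hypothetical.

```python
def k_means_1d(values, centers, max_iter=100):
    """Iterate assignment and center updates until the centers stop changing."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for v in values:
            # Assign each case to the nearest current center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Recompute each center as the mean of its cases (keep it if empty).
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters
```

Because distances depend on the measurement scale, standardizing the variables first matters as soon as more than one variable is involved, which is the point the text makes about PKM.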

The  programs  discussed  above  look  for  variables 
to  be  clustered  across  all  cases  or  for  cases  to  be 
clustered  (by  similarity)  across  all  variables. 
However,  your  data  may  include  differences  between 
cases  that  do  not  extend  across  all  the  variables, 
or  your  variables  may  not  cluster  across  all  cases. 
P3M  allows  some  of  the  variables  (columns)  to  be 
clustered  as  a  subset  of  the  cases  (rows)  and  vice 
versa.  This  clustering  by  both  cases  and  variables 
is  represented  by  a  data  matrix  in  the  form  of  a 
block  diagram;  rows  and  columns  are  permuted  and 
smaller  blocks  (submatrices)  of  similar  values 
within  the  larger  block  are  outlined.  This  gives  a 
good  visual  representation  of  patterns  of  like 
values  in  the  data  matrix  and  can  be  used  as  a 
multivariate  histogram.  P3M  is  best  suited  to  treat 
categorical  variables  that  take  on  a  small  number  of 
values.


Multivariate  Analysis 

Cluster  analysis  is  not  appropriate  for 
expressing  complex  functional  relationships.  For 
example,  if  you  are  interested  in  describing  the 
inter-relations  among  your  variables,  factor 
analysis  may  be  better  suited  to  your  needs,  and 
discriminant  analysis  provides  functions  of  the 
variables  that  best  separate  cases  into  predefined 
groups.

Factor  analysis.  Factor  analysis  is  useful  in 
exploratory  data  analysis.  It  has  three  general 
objectives:  to  study  the  correlations  of  a  large 
number  of  variables  by  clustering  the  variables  into 
factors,  such  that  variables  within  each  factor  are 
highly  correlated;  to  interpret  each  factor 
according  to  the  variables  belonging  to  it;  and  to 
summarize  many  variables  by  a  few  factors.  The  usual 
factor  analysis  model  expresses  each  variable  as  a 
function  of  factors  common  to  several  variables  and 
a factor unique to the variable:

$$z_j = a_{j1} f_1 + a_{j2} f_2 + \cdots + a_{jm} f_m + U_j$$

where

$z_j$ = the jth standardized variable
$m$ = the number of factors common to all the variables
$U_j$ = the factor unique to variable $z_j$
$a_{ji}$ = factor loadings
$f_i$ = common factors

The  number  of  factors,  m,  should  be  small  and  the 
contributions  of  the  unique  factors  should  also  be 
small. The individual factor loadings, $a_{ji}$, for each
variable  should  be  either  very  large  or  very  small 
so  each  variable  is  associated  with  a  minimum  number 
of  factors. 




To  the  extent  that  this  factor  model  is 
appropriate  for  your  data,  the  objectives  stated 
above  can  be  achieved.  Variables  with  high  loadings 
on  a  factor  tend  to  be  highly  correlated  with  each 
other,  and  variables  that  do  not  have  the  same 
loading  patterns  tend  to  be  less  highly  correlated. 
Each  factor  is  interpreted  according  to  the 
magnitudes  of  the  loadings  associated  with  it.  The 
original  variables  may  be  replaced  by  the  factors 
with  little  loss  of  information.  Each  case  receives 
a  score  for  each  factor;  these  factor  scores  are 
computed  as: 


$$f_i = b_{i1} z_1 + b_{i2} z_2 + \cdots + b_{ip} z_p$$

where $b_{ij}$ are the factor score coefficients. Factor
scores  can  be  used  in  later  analyses,  replacing  the 
values  of  the  original  variables.  Under  certain 
circumstances  these  few  factor  scores  are  freer  from 
measurement  error  than  the  original  variables,  and 
are  therefore  more  reliable  measures.  The  scores 
express  the  degree  to  which  each  case  possesses  the 
quality  or  property  that  the  factor  describes.  The 
factor  scores  have  mean  zero  and  standard  deviation 
one. 

There  are  four  main  steps  in  factor  analysis: 
first,  the  correlation  or  covariance  matrix  is 
computed;  second,  the  factor  loadings  are  estimated 
(initial  factor  extraction);  third,  the  factors  are 
rotated  to  obtain  a  simple  interpretation  (making 
the  loadings  for  each  factor  either  large  or  small, 
not  in-between);  and  fourth,  the  factor  scores  are 
computed.  P4M  provides  several  methods  for  initial 
factor  extraction  and  rotation.  You  can  specify  the 
methods  to  be  used  or  P4M  will  use  preassigned 
options.  The  results  can  be  presented  in  a  variety 
of  plots. 

P8M,  Boolean  Factor  Analysis,  is  an  alternate 
technique  when  the  variables  are  binary  or 
dichotomous. 

Canonical  correlation  analysis.  Canonical 
correlation  analysis  (P6M)  examines  the  relationship 
between  two  sets  of  variables,  and  can  be  viewed  as 
an  extension  of  multiple  regression  analysis  or  of 
multiple  correlation.  Multiple  regression  deals  with 
one  dependent  variable,  Y,  and  p  independent 
variables, $X_j$. The regression problem is to find a
linear  combination  of  the  X  variables  that  has 
maximum  correlation  with  Y.  In  canonical  correlation 
there  is  more  than  one  dependent  Y  variable  —  there 
is  a  set  of  them.  The  problem  is  to  find  a  linear 
combination  of  the  X  variables  that  has  maximum 
correlation  with  a  linear  combination  of  the  Y 
variables.  This  correlation  is  called  the  canonical 
correlation  coefficient.  Then  a  second  pair  of 
linear  combinations,  with  maximum  correlation 
between  this  pair  and  zero  correlations  with  the 
first pair of linear combinations, is found. The
number  of  pairs  of  linear  combinations  of  the  X  and 
Y  sets  is  equal  to  the  number  of  variables  in  the 
smaller  set  (X  or  Y).  The  technique  can  be  used  to 
test  the  independence  of  two  sets  of  variables,  or 
to  predict  information  about  a  hard-to-measure  set 
of  variables  from  a  set  that  is  easier  to  measure. 
It  can  also  be  used  to  relate  a  combination  of 
outcome  measures  to  a  combination  of  history  or 
baseline  measures.  The  original  and  canonical 
variables  can  be  plotted  one  against  the  other  in 
scatter  plots. 


Partial  correlations  and  multivariate  regression. 
Partial  correlations  can  be  computed  in  P6R;  the 
correlation  between  each  pair  of  dependent  variables 
is  computed  after  taking  out  the  linear  effect  of 
the  set  of  independent  variables.  For  example,  if 
you  want  to  do  a  factor  analysis  on  several 
variables  (systolic  blood  pressure,  diastolic  blood 
pressure,  blood  chemistry  measurements,  income, 
etc.)  but  want  to  remove  the  linear  effect  of  two 
variables  (age  and  weight)  from  the  measurements, 
you  can  state  that  the  two  variables  (age  and 
weight)  are  independent  variables  and  the  rest  are 
dependent  variables.  The  resulting  partial 
correlation  matrix  (of  the  dependent  variables  with 
the  effects  of  age  and  weight  removed)  can  be  stored 
as a matrix in a BMDP File, and can be used as input
in  P4M,  the  factor  analysis  program. 

P6R  can  be  used  to  regress  a  number  of  dependent 
variables  on  one  set  of  independent  variables.  This 
multivariate  regression  program  gives  you  a  separate 
regression  equation  for  each  dependent  variable, 
squared multiple correlation ($R^2$) of each
independent variable with all other independent
variables, $R^2$ of each dependent variable with the
set  of  independent  variables,  and  tests  of 
significance  of  multiple  regression. 

Discriminant  analysis.  In  discriminant  analysis, 
the  cases  or  subjects  are  divided  into  groups  and 
the  analysis  is  used  to  find  classification 
functions  (linear  combinations  of  the  variables) 
that  best  characterize  the  differences  between  the 
groups.  These  functions  are  also  useful  for 
classifying  new  cases. 

P7M,  the  stepwise  discriminant  analysis  program, 
is  used  to  find  the  subset  of  variables  that 
maximizes  group  differences.  Variables  are  entered 
into  the  classification  function  one  at  a  time  until 
the group separation ceases to improve notably (this
is similar to the stepwise regression program, P2R,
used  to  find  a  good  subset  of  variables  for 
prediction).  P7M  is  also  used  as  a  multivariate  test 
for  group  differences  (or  multivariate  analysis  of 
variance); Wilks' lambda (U statistic) and the F
approximation  to  lambda  are  printed  at  each  step  of 
the  output  for  testing  group  differences. 

A  geometrical  interpretation  of  discriminant 
analysis  can  be  given  by  plotting  each  case  as  a 
point  in  a  space  where  each  variable  is  a  dimension 
(has  an  axis).  The  points  are  projected  onto  a  plane 
or  hyperplane  selected  so  the  groups  are  farthest 
apart,  giving  a  good  visual  representation  of  how 
distinct  the  groups  are  (for  two  groups,  the  points 
(cases)  are  projected  onto  a  line  where  the  groups 
are  farthest  apart).  P7M  presents  plots  that  show 
such  a  plane.  The  X  axis  is  the  direction  where  the 
groups  have  the  maximum  spread;  the  Y  axis  shows  the 
maximum  spread  of  the  groups  in  a  direction 
orthogonal to the X axis; this is a plot of the
canonical  variables. 

The  canonical  variables  are  related  to  canonical 
correlation  analysis,  which  finds  the  linear 
combinations  of  the  two  sets  of  variables  that  are 
most  highly  correlated.  The  first  set  contains  the 
variables  in  the  classification  function;  the  second 
set can be viewed as dummy variables used to
indicate group membership. The value of the first
canonical variable of the classification function
set is plotted on the X axis; the value of the
second on the Y axis. The coefficients for these
canonical variables appear in the output. The
coefficients  for  the  second  set  (dummy  variables)  do 
not  appear  in  the  output.  The  eigenvalues  and 
canonical  correlations  for  all  canonical  variables 
and  the  canonical  variable  scores  associated  with 
the  first  and  second  canonical  variables  are  also 
reported. 

At  each  step,  P7M  uses  a  one-way  analysis  of 
variance F statistic (F-to-enter) to determine which
variable should join the function next. At step
zero, the standard univariate analysis of variance
test is made for each of the variables. The variable
for which the means differ most is entered first
into  the  classification  function.  After  step  zero, 
the  computed  F-to-enter  values  are  conditioned  on 
the  variables  already  present  in  the  function.  This 
is  like  an  analysis  of  covariance,  where  the 
previously  entered  variables  can  be  viewed  as 
covariates  and  the  nonentered  variables  are 
considered   as   dependent  variables. 
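
The step-zero selection can be sketched as follows. This is a minimal illustration under stated assumptions, not P7M code: it computes the ordinary one-way analysis of variance F statistic for each candidate variable and picks the largest. The variable names and data are hypothetical, and the conditioning on already-entered variables used after step zero is not shown.

```python
from statistics import mean

def one_way_f(groups):
    """One-way ANOVA F statistic for one variable; groups is a list of lists."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical data: two variables measured on the same three groups.
data = {
    "cover": [[10.0, 12.0, 11.0], [20.0, 22.0, 21.0], [30.0, 29.0, 31.0]],
    "slope": [[5.0, 7.0, 6.0], [6.0, 5.0, 7.0], [7.0, 6.0, 5.0]],
}

# F-to-enter at step zero for every candidate variable.
f_to_enter = {name: one_way_f(groups) for name, groups in data.items()}

# The variable whose group means differ most enters first.
first = max(f_to_enter, key=f_to_enter.get)
```

Here the group means of "cover" differ widely while those of "slope" are identical, so "cover" would be entered first.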

At  each  step  after  a  variable  is  entered,  the 
classification  functions  are  recomputed  including 
the  newly  entered  variable.  The  number  of 
classification  functions  is  equal  to  the  number  of 
groups.  If  you  have  six  groups,  the  values  of  all 
six  functions  are  computed  for  each  case  and  the 
values  are  used  to  compute  the  posterior 
probability;  each  case  is  assigned  to  the  group  in 
which  the  value  of  the  posterior  probability  is 
maximum.  In  multiple  group  discriminant  analysis, 
one  function  is  sometimes  stated  in  the  literature 
for  separating  each  pair  of  groups.  To  get  this 
function  from  P7M,  you  subtract  the  classification 
function coefficients of the first member from
those of the second. At each step, F statistics (the
F matrix) that test the equality of means between
each pair of groups are given. These F statistics
are proportional to Hotelling's T² and the
Mahalanobis D² and give an indication of which
group means are closest together and which are
farthest apart. After all variables have been
entered, the program lists the Mahalanobis D² from
each  case  to  the  center  of  each  group,  and  the 
posterior  probability  of  the  case  assigned  to  each 
group.  These  two  bits  of  information  present  a  good 
picture  of  how  well  (or  how  poorly)  each  case  has 
been   classified. 
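
The link between the Mahalanobis D² and the posterior probability can be sketched as below. This is an illustration under assumptions that the text does not state: two variables, equal prior probabilities, and a common (pooled) within-group covariance matrix. Under those assumptions the posterior probability is proportional to exp(-D²/2), which gives the same assignment as comparing the linear classification function values. The data and function names are hypothetical.

```python
import math

def mean_vec(rows):
    """Mean of each of the two variables over the cases in rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def pooled_cov(groups):
    """Pooled within-group covariance matrix for two variables."""
    n = sum(len(g) for g in groups)
    s = [[0.0, 0.0], [0.0, 0.0]]
    for g in groups:
        m = mean_vec(g)
        for r in g:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    df = n - len(groups)
    return [[s[i][j] / df for j in range(2)] for i in range(2)]

def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def mahalanobis_d2(x, m, s_inv):
    """Squared Mahalanobis distance from case x to group center m."""
    d = [x[0] - m[0], x[1] - m[1]]
    return sum(d[i] * s_inv[i][j] * d[j] for i in range(2) for j in range(2))

def posteriors(x, groups):
    """D-squared to each group and posterior probabilities (equal priors assumed)."""
    s_inv = inverse_2x2(pooled_cov(groups))
    d2 = [mahalanobis_d2(x, mean_vec(g), s_inv) for g in groups]
    weights = [math.exp(-0.5 * (v - min(d2))) for v in d2]
    total = sum(weights)
    return d2, [w / total for w in weights]

# Hypothetical cases for two groups, two variables each.
group_a = [[1.0, 2.0], [2.0, 1.0], [2.0, 3.0], [3.0, 2.0]]
group_b = [[6.0, 6.0], [7.0, 5.0], [7.0, 7.0], [8.0, 6.0]]
d2, post = posteriors([3.0, 3.0], [group_a, group_b])
assigned = post.index(max(post))  # 0 = group A, 1 = group B
```

A case with a small D² to one group center and a posterior probability near 1 for that group is classified cleanly; similar D² values to several centers indicate a doubtful classification.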

The  discriminant  analysis  procedure  is  successful 
if  few  cases  are  classified  into  the  wrong  groups. 
If  a  large  percentage  of  the  cases  are  classified 
correctly  (if  the  posterior  probability  assigns  them 
to  their  original  group)  you  know  that  group 
differences  do  exist  and  that  you  have  selected  a 
set  of  variables  that  exhibit  the  differences.  The 
P7M  output  presents  this  classification  information 
in  a  table  of  counts  indicating  how  many  cases  from 
each  original  group  are  assigned  to  each  of  the 
possible  groups.  A  pseudo-jackknife  classification 
table  is  also  printed:  for  each  case  a 
classification  function  is  computed  with  the  case 
omitted  from  the  computations.  The  function  is  then 
used  to  classify  the  omitted  case.  This  results  in  a 
classification  with  less  bias.  (A  classification 
function  can  produce  optimistic  results  when  it  is 
used  to  classify  the  same  cases  that  were  used  to 
compute   it.) 
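
The difference between the two classification tables can be sketched with a deliberately simple rule. The example below is an assumption-laden illustration, not P7M's procedure: it uses one variable and a hypothetical nearest-group-mean rule in place of the classification functions, and it compares classifying every case with all data (resubstitution) against classifying each case with that case left out.

```python
from statistics import mean

def classify(x, means):
    """Assign x to the group whose mean is nearest (hypothetical simple rule)."""
    return min(means, key=lambda g: abs(x - means[g]))

def resubstitution_table(data):
    """Classify every case using means computed from all cases."""
    means = {g: mean(v) for g, v in data.items()}
    return {g: [classify(x, means) for x in v] for g, v in data.items()}

def jackknife_table(data):
    """Classify every case using means computed with that case omitted."""
    out = {}
    for g, values in data.items():
        out[g] = []
        for i, x in enumerate(values):
            means = {h: mean(v) for h, v in data.items() if h != g}
            means[g] = mean(values[:i] + values[i + 1:])
            out[g].append(classify(x, means))
    return out

def n_correct(table):
    """Number of cases assigned to their original group."""
    return sum(pred == g for g, preds in table.items() for pred in preds)

# Hypothetical one-variable data for two groups.
data = {"low": [1.0, 2.0, 3.0, 7.0], "high": [6.0, 8.0, 9.0, 10.0]}
resub_correct = n_correct(resubstitution_table(data))
jack_correct = n_correct(jackknife_table(data))
```

With these made-up values the leave-one-out table classifies fewer cases correctly than the resubstitution table, which illustrates the optimism the text warns about.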

PLR, Stepwise Logistic Regression, provides an
alternative to the multivariate normal model of P7M.
When there are only two groups, the all possible
subsets regression program, P9R, prints alternative
functions for each subset which may classify the
cases equally well.

Preference  Pairs.  P9M,  Linear  Scores  from 
Preference  Pairs,  is  used  to  obtain  a  linear 
function  of  one  set  of  variables  that  reproduces  the 
ordering  of  cases  as  established  by  recorded 
preferences  (stated  by  expert  judges)  between 
selected   pairs   of   cases. 


Survival  Analysis 

The  techniques  described  in  Chapter  19  are 
appropriate  when  outcome  measurements  represent  the 
time  to  occurrence  of  some  event  or  response  (e.g., 
survival  time,  or  time  to  disease  recurrence).  What 
distinguishes  the  techniques  of  this  chapter  from 
other  statistical  methodology  is  the  ability  to 
handle  censored  (incomplete)  data;  that  is,  there 
are  cases  for  which  the  response  is  not  observed  but 
the data (time in study) are included in the
analysis.  This  could  occur  in  a  study  of  survival, 
where  an  individual  may  remain  alive  at  the  close  of 
the  observation  period  or  may  drop  out  before  the 
end. 

P1L  estimates  the  survival  (time-to-response) 
distribution  of  individuals  observed  over  varying 
time  periods.  These  estimates  can  be  obtained 
separately  for  different  groups  of  patients;  the 
equality  of  the  distributions  for  these  groups  can 
be  tested  by  two  nonparametric  rank  tests.  Plots  of 
the  survival,  hazard  and  related  functions  can  be 
printed. 
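
How censored cases contribute to a survival estimate can be sketched with the product-limit (Kaplan-Meier) estimator. This is one standard estimator chosen for illustration; the text above does not say which estimator P1L uses. The data are hypothetical.

```python
def product_limit(times, observed):
    """Product-limit survival estimate as a list of (time, survival) pairs.

    times[i] is the time in study for case i; observed[i] is True if the
    response occurred at that time and False if the case was censored.
    """
    event_times = sorted({t for t, o in zip(times, observed) if o})
    survival, curve = 1.0, []
    for t in event_times:
        # Censored cases still count as at risk up to their time in study.
        at_risk = sum(1 for u in times if u >= t)
        events = sum(1 for u, o in zip(times, observed) if u == t and o)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical study: five cases, two of them censored (False).
times = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]
curve = product_limit(times, observed)
```

The censored cases never produce a drop in the curve, but they enlarge the at-risk count for every event time up to the point at which they leave the study.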

P2L  provides  Cox  model  survival  analysis  when 
there  are  covariates.  Covariates  can  be  selected  in 
a  stepwise  manner. 


Time  Series  Analysis 

The  primary  distinguishing  feature  of  time  series 
analysis,  as  opposed  to  other  types  of  statistical 
analysis,  is  the  assumption  that  cases  of  data 
represent  measurements  or  observations  made  at 
equispaced  points  along  some  linear  dimension. 
Usually  the  underlying  linear  dimension  is  time,  as 
in  the  record  of  a  subject's  blood  pressure  taken 
every second over a period of time. However, time is
occasionally  replaced  by  some  other  dimension.  For 
example,  the  thickness  of  thread  from  a  certain 
manufacturing  process  might  be  measured  each 
millimeter  along  the  length  of  the  thread.  This 
would  constitute  a  'time'  series  in  which  length 
replaces  time  as  the  underlying  linear  dimension. 
Nevertheless,  we  follow  conventional  terminology  and 
use  the  word  'time'. 

A  basic  goal  of  time  series  analysis  is  to 
characterize  the  way  in  which  the  data  vary  over 
time.  The  a  priori  assumption,  common  in  most  other 
types  of  statistical  analysis,  that  cases  are 
statistically independent, is here relaxed. We allow
that  cases  may  be  correlated,  assuming  that  the 
correlation  between  cases  depends  on  the  time 
interval  separating  them.  In  addition,  we  allow  for 
the  presence  of  a  trend  in  the  data.  Thus  the  trend 
of  increasing  commodity  prices  over  the  last  decade 
might be represented by a straight line, or by an
[remainder of sentence illegible in source] the estimated trend and
[illegible].

The BMDP package includes two programs for time
series analysis: BMDP1T [illegible] the frequency domain
approach, while BMDP2T [illegible] the time domain
approach. We may describe the frequency domain
approach as representing the data by a [illegible]
of sinusoidal [illegible] enable the user, by
[illegible] and accompanying printout, to
identify the groups [illegible]
of the overall variability of the data. In the time
domain approach, we seek a simple and yet
[illegible] parametric time series model that
captures the variability of the data. [Illegible] uses
the iterative model [illegible] of Box and
Jenkins, [illegible] facilities [illegible] model
identification, [illegible] estimation, and
[illegible] analysis. Once a [illegible]
which are described in Chapter 20.
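
The two ideas introduced in this section, a trend and a correlation between cases that depends on the interval separating them, can be sketched as follows. This is a minimal illustration, not the method of either BMDP program: it fits a least-squares straight line to equispaced observations and then computes the sample autocorrelation of the residuals at a chosen lag. The series and function names are hypothetical.

```python
def linear_trend(y):
    """Least-squares straight line through equispaced observations y[0..n-1]."""
    n = len(y)
    t_bar = (n - 1) / 2
    y_bar = sum(y) / n
    sxx = sum((t - t_bar) ** 2 for t in range(n))
    slope = sum((t - t_bar) * (v - y_bar) for t, v in enumerate(y)) / sxx
    return y_bar - slope * t_bar, slope  # (intercept, slope)

def autocorrelation(x, lag):
    """Sample autocorrelation of x between cases separated by lag time steps."""
    n = len(x)
    m = sum(x) / n
    c0 = sum((v - m) ** 2 for v in x)
    return sum((x[i] - m) * (x[i + lag] - m) for i in range(n - lag)) / c0

# Hypothetical series: a rising trend plus an alternating +1/-1 component.
y = [t + (1.0 if t % 2 == 0 else -1.0) for t in range(10)]

intercept, slope = linear_trend(y)
residuals = [v - (intercept + slope * t) for t, v in enumerate(y)]
r1 = autocorrelation(residuals, 1)  # neighbouring cases
r2 = autocorrelation(residuals, 2)  # cases two steps apart
```

In this made-up series, neighbouring residuals are strongly negatively correlated and residuals two steps apart are positively correlated, so the cases are clearly not independent.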




SPSS 


THE FOLLOWING FILE IS AN EXAMPLE SPSSX BATCH JOB
A363/SPSSXCC

INSERT YOUR CHARGE CODE AND PASSWORD AND RUN THE JOB FOR MORE
INFORMATION ON THE USE OF SPSSX


Chapter  1       Introduction 


The SPSSX™ Batch System is a comprehensive tool for managing, analyzing,
and displaying data. Its capabilities include

•  Input  from  almost  any  type  of  data  file. 

•  File  management,  including  sorting,  splitting,  and  aggregating  files, 
match-merging  multiple  files,  and  saving  fully  defined  system  files. 

•  Data  management,  including  sampling,  selecting,  and  weighting  cases, 
recoding  variables,  and  creating  new  variables  using  extensive  numeric 
and  string  functions. 

•  Tabulation  and  statistical  analysis — from  describing  single  variables  to 
performing  complex  multivariate  analyses. 

•  Report  writing. 

•  Device-independent graphics. [Handwritten annotation: BLM doesn't have this.]


This manual describes the operations of the SPSSX system. It tells how to
get results, not how to interpret them. If you are unfamiliar with the
statistical procedures the system makes available, you might begin with
SPSSX Introductory Statistics Guide, where the most frequently used
procedures are presented with overviews of the statistics they calculate.

Two types of information not included in this manual are available
through the INFO command documented in Chapter 2. These are information
specific to the computer, operating system, and installation at which
you use SPSSX and information about new features and changes to SPSSX
since publication of this manual. The INFO command is both an integral
part of SPSSX documentation and a guide to system developments
available on your computer.

This first chapter gives an overview of SPSSX. The sample job in
Section 1.1 illustrates how individual commands work together in an
SPSSX job and shows some of the printed output the job creates. Sections
1.2 and 1.3 describe the types of files used in SPSSX, and Sections 1.4
through 1.9 review data definition facilities. Sections 1.10 through 1.14
introduce facilities for revising and creating variables, and Sections 1.15
through 1.19 briefly summarize file management facilities. Sections 1.20
through 1.23 describe utilities for controlling the environment of an SPSSX
job, for printing and writing cases, and for generating tables, reports, and
high-quality graphs. Finally, Sections 1.24 through 1.36 summarize the
statistical procedures available in SPSSX.



SPSS 


For information about the SPSSX™ system, SPSS® Graphics, SCSS™, and
other  software  produced  and  distributed  by  SPSS  Inc.,  please  write  or  call 

Marketing  Department 

SPSS  Inc. 

Suite  3300 

444  North  Michigan  Avenue 

Chicago,  IL  60611 

(312)329-2400 

SPSSX, SPSS, and SCSS are the trademarks of SPSS Inc. for its proprietary
computer software. No material describing such software may be produced
or distributed without the written permission of the owners of the
trademark and license rights in the software and the copyrights in the published
materials.

SPSSX  User's  Guide 
Copyright  ©  1983  by  SPSS  Inc. 
All  rights  reserved. 

Printed  in  the  United  States  of  America. 

No part of this publication may be reproduced, stored in a retrieval system,
or transmitted, in any form or by any means, electronic, mechanical,
photocopying, recording, or otherwise, without the prior written
permission of the publisher.

1234567890  SMSM  89  8  7  6  5  4  3 
ISBN 0-07-046550-9

Library  of  Congress  Catalog  Card  No.:  82-062808 


Chapter  15       Combining  System  Files    225 

15.1     THE  MATCH  FILES  COMMAND    225 
15.2    Parallel  Files    226 

15.3  The  FILE  Subcommand     15.4  Specifying  the  Active  File     15.5  The  MAP  Subcommand 
15.6  The  RENAME  Subcommand     15.7  The  DROP  and  KEEP  Subcommands 
15.8  Reordering  Variables 

15.9    Nonparallel  Files    230 

15.10  The FILE and BY Subcommands     15.11  Common Variables     15.12  The IN Subcommand
15.13  Tables  and  Files    234 

15.14  The  FILE,  TABLE,  and  BY  Subcommands     15.15  The  FIRST  and  LAST  Subcommands 
15.16  The  FIRST  and  LAST  Subcommands  on  One  File    238 
15.17  THE  ADD  FILES  COMMAND    238 
15.18  Concatenating  Files    238 

15.19  The  FILE  Subcommand     15.20  Optional  Subcommands 
15.21  Interleaving  Files    241 

15.22  The  FILE  and  BY  Subcommands     15.23  Optional  Subcommands 

Chapter  16        File  Interfaces    245 

16.1     THE  SCSS  INTERFACE    245 

16.2  The  SAVE  SCSS  Command    245 

16.3  The  OUTFILE  Subcommand     16.4  The  KEEP  and  DROP  Subcommands 

16.5  The  RENAME  Subcommand     16.6  The  Display  Output 

16.7  The  GET  SCSS  Command    248 

16.8  The  MASTERFILE  Subcommand     16.9  The  WORKFILE  Subcommand 
16.10  The  VARIABLES  Subcommand 

16.11  TRANSPORTING  SPSSX  SYSTEM  FILES    250 

16.12  Considerations  for  Portable  Files    250 

16.13  Characteristics  of  Portable  Files    251 
16.14  Character  Translation 

16.15  The  EXPORT  Command    251 

16.16  The  KEEP  and  DROP  Subcommands     16.17  The  RENAME  Subcommand 
16.18  The  MAP  Subcommand     16.19  The  DIGITS  Subcommand 

16.20  The  IMPORT  Command    253 

16.21  The  KEEP  and  DROP  Subcommands     16.22  The  RENAME  Subcommand 

16.23  The  MAP  Subcommand 

Chapter 17       Using Procedures in SPSSX    257

17.1  WHAT  IS  A  PROCEDURE?    257 

17.2  PROCEDURE  PLACEMENT    258 

17.3  The  EXECUTE  Command    258 

17.4  The  BEGIN  DATA  and  END  DATA  Commands    259 

17.5  OPTIONS  AND  STATISTICS  COMMANDS    259 

17.6  SAVING  CASEWISE  RESULTS    260 

17.7  PROCEDURES  AND  OUTPUT  FILES    260 

17.8  The  PROCEDURE  OUTPUT  Command    260 

17.9  Matrix  Materials    261 

17.10  Writing Matrix Materials      17.11  The INPUT MATRIX Command

17.12  The  N  OF  CASES  Command      17.13  Passing  Matrix  Materials  among  Procedures 

17.14  Split-File  Processing 

Chapter  18        FREQUENCIES    265 

18.1  OVERVIEW    265 

18.2  OPERATION    266 

18.3     The  VARIABLES  Subcommand     266 

18.4  General  vs.  Integer  Mode 

18.5  The  FORMAT  Subcommand    267 

18.6  Table  Formats      18.7  The  Order  of  Values      18.8  Suppressing  Tables      18.9  Index  of  Tables 
18.10  Writing  Tables  to  a  File 

18.11  Bar  Charts  and  Histograms    269 

18.12  The  BARCHART  Subcommand      18.13  The  HISTOGRAM  Subcommand 
18.14  The  HBAR  Subcommand 



18.15  Percentiles  and  Ntiles    273 

18.16  The  PERCENTILES  Subcommand      18.17  The  NTILES  Subcommand 

18.18  The  STATISTICS  Subcommand    276 

18.19  Missing  Values    276 

18.20  LIMITATIONS    277 

Chapter  19        CONDESCRIPTIVE    279 

19.1  OVERVIEW    279 

19.2  OPERATION    279 

19.3  The  Variable  List    279 

19.4  Statistics    280 

19.5  Z  Scores    281 

19.6  Missing  Values     282 

19.7  Formatting  Options    282 
19.8     LIMITATIONS    284 

Chapter  20        CROSSTABS    287 

20.1  OVERVIEW    287 

20.2  OPERATION    288 

20.3  General  Mode     288 

20.4  Integer  Mode    291 

20.5  The VARIABLES Subcommand     20.6  The TABLES Subcommand
20.7     Cell  Contents     293 

20.8  Percentages    20.9  Expected  Values  and  Residuals 

20.10  Optional  Statistics     294 

20.11  Missing  Values     295 

20.12  Formatting  Options    297 

20.13  Indexing  Tables     297 

20.14  Writing  and  Reproducing  Tables     297 

20.15  Writing  Tables  to  a  File    20.16  The  Output  File    20.17  Reproducing  Tables 
20.18  LIMITATIONS    299 


Chapter  21        MULT  RESPONSE    303 


21.1     INTRODUCTION  TO  MULTIPLE  RESPONSE  ITEMS    303 

21.2  Constructing  Group  Variables     304 

21.3  Crosstabulations     305 

21.4  OVERVIEW    307 

21.5  OPERATION     307 

21.6  The  GROUPS  Subcommand    308 

21.7  The  VARIABLES  Subcommand    309 

21.8  The  FREQUENCIES  Subcommand    309 

21.9  The TABLES Subcommand    311
21.10  Paired  Crosstabulations 

21.11  Cell  Contents  and  Percentages    315 

21.12  Missing  Values     315 

21.13  Formatting  Options     316 

21.14  Stub  and  Banner  Tables     316 
21.15  LIMITATIONS     317 


Chapter  22        BREAKDOWN    321 


22.1  OVERVIEW    321 

22.2  OPERATION     321 

22.3  General  Mode     322 

22.4  Integer  Mode     324 

22.5  The VARIABLES Subcommand     22.6  The TABLES Subcommand
22.7  The CROSSBREAK Alternate Display Format


22.8  Optional  Statistics     326 

22.9  Missing  Values    327 

22.10  Formatting  Options     328 

22.11   LIMITATIONS     329 


Chapter  23        REPORT    333 


23.1     INTRODUCTION     333 
23.2     Page  Layout     334 

23.3  Columns    23.4  Rows 

23.5  Breaks  and  Break  Variables    335 

23.6  Command  Overview     336 

23.7  A  Company  Report    337 

23.8     THE  FORMAT  SUBCOMMAND    338 

23.9  The  LIST  Keyword    338 

23.10  Page  Dimensions     339 

23.11  Vertical  Spacing    339 

23.12  The  MISSING  Keyword     339 

23.13  FORMAT  Summary    339 

23.14  THE  VARIABLES  SUBCOMMAND    340 

23.15  Column  Contents    340 

23.16  Column  Widths    341 

23.17  Column  Heads    341 

23.18  Positioning  Columns  under  Heads    342 

23.19  Intercolumn  Spacing    342 

23.20  VARIABLES  Summary     343 
23.21  THE  STRING  SUBCOMMAND    346 

23.22  Variables  within  Strings    346 

23.23  Literals  within  Strings    347 

23.24  Using  Strings    347 

23.25  STRING  Specifications    347 

23.26  The  Company  Report  Using  Strings     348 
23.27  THE  BREAK  SUBCOMMAND    349 

23.28  Column  Heads,  Contents,  and  Width    349 

23.29  One-  and  Two-Break  Reports  with  Two  Variables 

23.30  Keyword  (TOTAL)  and  Multiple  Break  Reports     23.31  Reports  with  No  Breaks 

23.32  BREAK  Summary    354 
23.33  THE  SUMMARY  SUBCOMMAND     354 

23.34  Basic  Specifications     354 

23.35  REPORT  Statistics     355 

23.36  Composite  Functions    355 

23.37  Multiple  Aggregate  Functions     355 

23.38  Summary  Titles    356 

23.39  Spacing  Summary  Lines     357 

23.40  Summary  Titles  in  Break  Columns     23.41   Print  Formats  for  Summaries     23.42  Using  Composite 
Functions     23.43  Nested Composite Functions

23.44  Multiple  Summary  Statistics  on  One  Line     364 

23.45  Repeating Summary Specifications     365

23.46  SUMMARY  Summary    366 

23.47  TITLES  AND  FOOTNOTES     367 

23.48  THE  MISSING  SUBCOMMAND    368 

23.49  REPORTS  WITH  NO  BREAKS     369 

23.50  SUBCOMMAND  ORDER    370 

23.51  Limitations    370 

23.52  Trial  Runs     371 

23.53  Split-File  Processing    371 

23.54  Sorting  Cases     371 

23.55  REPORT  Compared  with  Other  Procedures     372 

23.56  Producing  CROSSBREAK-like  Tables     23.57  Producing  CROSSTABS-like  Tables 
23.58  REPORT and Other SPSSX Commands     374




Chapter  25        T-TEST    431 


25.1  OVERVIEW    431

25.2  OPERATION    431 

25.3     Independent  Samples    431 

25.4  The  GROUPS  Subcommand    25.5  The  VARIABLES  Subcommand 

25.6  Paired  Samples    434 

25.7  Independent  and  Paired  Designs     435 

25.8  One-Tailed  Significance  Levels    435 

25.9  Missing  Values    435 

25.10  Formatting  Options    436 
25.11  LIMITATIONS    436 


Chapter  26        ANOVA    439 


26.1  OVERVIEW    439 

26.2  OPERATION    440 

26.3     Specifying  Full  Factorial  ANOVA  Models     440 
26.4  Cell  Means 

26.5  Suppressing  Interaction  Effects     443 

26.6  Specifying  Covariates     443 

26.7  Order  of  Entry  of  Covariates     26.8  Regression  Coefficients  for  the  Covariates 

26.9  Methods  for  Decomposing  Sums  of  Squares    444 

26.10  Summary  of  Analysis  Methods    448 

26.11  Multiple  Classification  Analysis    449 

26.12  Missing  Values    450 

26.13  Formatting  Options     450 

26.14  LIMITATIONS    450 

Chapter  27        ONEWAY    453 

27.1  OVERVIEW    453 

27.2  OPERATION     454 

27.3  Specifying  the  Design    454 

27.4  The  POLYNOMIAL  Subcommand    454 

27.5  The  CONTRAST  Subcommand     455 

27.6  The  RANGES  Subcommand    456 

27.7  User-Specified  Ranges     27.8  Harmonic  Means 

27.9  Optional  Statistics    459 

27.10  Missing  Values     459 

27.11  Formatting  Options    459 

27.12  Matrix  Materials    460 

27.13  Writing  Matrices     27.14  Reading  Matrices 

27.15  LIMITATIONS    461 

Chapter  28        MANOVA:  General  Linear  Models    465 

28.1  OVERVIEW    465 

28.2  OPERATION     466 

28.3     The  MANOVA  Specification    466 

28.4  Dependent  Variable  List      28.5  Factor  List      28.6  Covariate  List 

28.7  The  ANALYSIS  Subcommand    468 

28.8  The  DESIGN  Subcommand    469 

28.9  Simple  Main  Effects      28.10  Interaction  Terms      28.11  Single-Degree-of-Freedom  Effects 
28.12  Keyword CONTIN      28.13  Interactions Between Factors and Interval Variables

28.14  Nested Designs      28.15  Lumped Effects      28.16  Keyword CONSPLUS      28.17  Error Terms
28.18  Keyword  CONSTANT      28.19  Keyword  MWITHIN 

28.20  The  WSFACTORS  Subcommand    473 

28.21  The  WSDESIGN  Subcommand    474 

28.22  The  ANALYSIS  Subcommand  for  Repeated  Measures  Designs    475 




28.23  The  MEASURE  Subcommand    476 

28.24  The  TRANSFORM  Subcommand    476 

28.25  Keyword REPEATED      28.26  Keyword POLYNOMIAL      28.27  Keyword SPECIAL
28.28  Multiple  Variable  Lists 

28.29  The  RENAME  Subcommand    479 

28.30  The  METHOD  Subcommand    480 

28.31  Keyword  MODELTYPE      28.32  Keyword  ESTIMATION      28.33  Keyword  SSTYPE 

28.34  The  PARTITION  Subcommand    481 

28.35  The  CONTRAST  Subcommand    482 

28.36  The  SETCONST  Subcommand    486 

28.37  The  ERROR  Subcommand    486 

28.38  The  PRINT  and  NOPRINT  Subcommands    487 

28.39  Keyword  CELLINFO     28.40  Keyword  HOMOGENEITY     28.41  Keyword  DESIGN 
28.42  Keyword  PRINCOMPS      28.43  Keyword  ERROR      28.44  Keyword  SIGNIF 
28.45  Keyword  DISCRIM     28.46  Keyword  PARAMETERS     28.47  Keyword  OMEANS 
28.48  Keyword  PMEANS      28.49  Keyword  POBS      28.50  Keyword  TRANSFORM 
28.51  Keyword  FORMAT 

28.52  The  PLOT  Subcommand    493 

28.53  Matrix  Materials    497 

28.54  The  WRITE  Subcommand     28.55  The  READ  Subcommand 

28.56  Missing  Values    499 
28.57  EXAMPLES  OF  COMMON  DESIGNS    499 

28.58  Univariate  Analysis  of  Variance    499 

28.59  Specifying  a  Model  with  the  DESIGN  Subcommand     28.60  Specifying  the  ERROR  Term 
28.61  Using  DESIGN  and  ERROR     28.62  Partitioning  the  Sum  of  Squares     28.63  Contrasts 

28.64  Randomized  Block  Designs    501 

28.65  Complete  Randomized  Block  Designs 

28.66  Balanced  Incomplete  (Randomized)  Block  Designs  (BIB) 

28.67  Partially  Balanced  Incomplete  Block  Designs  (PBIB) 

28.68  Latin  and  Other  Squares    502 

28.69  Nested  Designs    502 
28.70  MANOVA  EXAMPLES    509 

28.71  Example 1:  Analysis of Covariance Designs    509

28.72  Example  2:  Multivariate  One-Way  ANOVA    513 

28.73  Example  3:  Multivariate  Multiple  Regression,  Canonical  Correlation    519 

28.74  Example  4:  Repeated  Measures    529 

28.75  Example  5:  Repeated  Measures  with  a  Constant  Covariate    527 

28.76  Example  6:  Repeated  Measures  with  a  Varying  Covariate    531 

28.77  Example  7:  A  Doubly  Multivariate  Repeated  Measures  Design    532 

28.78  Example  8:  Profile  Analysis    536 


Chapter 29        LOGLINEAR    541


29.1  OVERVIEW    541 

29.2  OPERATION    542 

29.3    The  LOGLINEAR  Specification    542 

29.4  The  Logit  Model     29.5  Specifying  Covariates 

29.6    The  DESIGN  Subcommand    544 

29.7  Specifying  Main  Effects  Models      29.8  Specifying  Interactions:  Keyword  BY 
29.9  Specifying  Covariates      29.10  Single-Degree-of-Freedom  Partitions 

29.11  The  CWEIGHT  Subcommand    545 

29.12  The  GRESID  Subcommand    546 

29.13  The  PRINT  and  NOPRINT  Subcommands    547 

29.14  The  PLOT  Subcommand    547 

29.15  The  CONTRAST  Subcommand    548 

29.16  Contrasts  for  a  Multinomial  Logit  Model      29.17  Contrasts  for  a  Linear  Logit  Model 
29.18  Contrasts  for  a  Logistic  Regression  Model 

29.19  The  CRITERIA  Subcommand    554 

29.20  The  WIDTH  Subcommand    554 

29.21  Missing  Values    554 
29.22  LOGLINEAR  EXAMPLES    555 

29.23  Example  1:  A  General  Log-linear  Model    555 

29.24  Example  2:  A  Multinomial  Logit  Model     558 

29.25  Example  3:  Frequency  Table  Models     558 

29.26  Example  4:  A  Linear  Logit  Model     560 



29.27  Example  5:  Logistic  Regression  Model     562 

29.28  Example  6:  Multinomial  Response  Models     564 

29.29  Example  7:  A  Distance  Model     567 


Chapter  30        SCATTERGRAM    571 


30.1  OVERVIEW    571 

30.2  OPERATION    571 

30.3     Specifying  the  Design    572 

30.4  Default Scatterplot
30.5     Scaling     573 

30.6  Setting  Bounds     30.7  Integer  Scaling 

30.8  Optional  Statistics    576 

30.9  Missing  Values     577 

30.10  Formatting  Options    577 

30.11  Random  Sampling    577 
30.12  LIMITATIONS    577 


Chapter 31        PEARSON CORR    579

31.1  OVERVIEW    579

31.2  OPERATION     579

31.3  Specifying  the  Design     580 

31.4  Two-Tailed  Significance  Levels     581 

31.5  Optional  Statistics     581 

31.6  Missing  Values     584 

31.7  Formatting  Options    584 

31.8  Writing  Matrix  Materials    585 

31.9  LIMITATIONS    586

Chapter 32        PARTIAL CORR    589

32.1  OVERVIEW     589

32.2  OPERATION    590

32.3  [title missing in source]    590
32.4  Correlation  List     32.5  Control  List  and  Order  Values     32.6  Specifying  Multiple  Analyses 

32.7  Two-Tailed  Significance  Levels     593 

32.8  Optional  Statistics     593 

32.9  Missing  Values     594 

32.10  Formatting  Options     594 

32.11  Matrix  Materials    595 

32.12  Reading  Matrices     32.13  Indexing  Matrices     32.14  Writing  Matrices 

32.15  LIMITATIONS     597 


Chapter  33        REGRESSION    601 


33.1  OVERVIEW    601 

33.2  OPERATION     602 

33.3     Minimum  Required  Syntax    602 

33.4  The  VARIABLES  Subcommand      33.5  The  DEPENDENT  Subcommand 

33.6  The  Method  Subcommands 
33.7    VARIABLES  Subcommand  Modifiers    606 

33.8  The MISSING Subcommand      33.9  The DESCRIPTIVES Subcommand

33.10  The  SELECT  Subcommand 
33.11  Equation  Control  Modifiers    608 

33.12  The  CRITERIA  Subcommand      33.13  The  STATISTICS  Subcommand 

33.14  Regression  through  the  Origin 




33.15  Analysis  of  Residuals    612 

33.16  Temporary  Variables     33.17  The  RESIDUALS  Subcommand 

33.18  The  CASEWISE  Subcommand     33.19  The  SCATTERPLOT  Subcommand 

33.20  The  PARTIALPLOT  Subcommand     33.21  The  SAVE  Subcommand 

33.22  Matrix  Materials    619 

33.23  The  READ  Subcommand     33.24  The  WRITE  Subcommand 
33.25  The  WIDTH  Subcommand    621 


Chapter  34        DISCRIMINANT    623 


34.1  OVERVIEW    623 

34.2  OPERATION    624 

34.3  The  GROUPS  Subcommand    624 

34.4  The  VARIABLES  Subcommand    625 

34.5  The  ANALYSIS  Subcommand    626 

34.6  Variable  Selection    627 

34.7  The  METHOD  Subcommand     34.8  Inclusion  Levels     34.9  The  MAXSTEPS  Subcommand 
34.10  Statistical  Controls 

34.11  The  FUNCTIONS  Subcommand    632 

34.12  Optional  Statistics  for  the  Analysis  Phase    633 

34.13  The  SELECT  Subcommand    634 

34.14  Rotation  Options    634 

34.15  Display  Options    634 

34.16  Classifying  Cases    634 

34.17  The  PRIORS  Subcommand     34.18  The  Classification  Results  Table     34.19  Classification  Plots 
34.20  Printed  Discriminant  Scores     34.21  Classification  Options 
34.22  Using  Classification  Coefficients 

34.23  Missing  Values    641 

34.24  The SAVE Subcommand    641

34.25  Matrix  Materials    642 

34.26  Writing  Matrices     34.27  Reading  Matrices 
34.28  Summary  of  Syntax  Rules    644 
34.29  LIMITATIONS    645 


Chapter  35        FACTOR    647 


35.1  OVERVIEW    647 

35.2  OPERATION    648 

35.3    The  Variable  Selection  Block    648 

35.4  The VARIABLES Subcommand     35.5  The MISSING Subcommand
35.6  The  WIDTH  Subcommand 

35.7    The  Extraction  Block    649 

35.8  The  ANALYSIS  Subcommand      35.9  The  EXTRACTION  Subcommand 

35.10  The  PRINT  Subcommand     35.11  The  FORMAT  Subcommand     35.12  The  PLOT  Subcommand 

35.13  The  CRITERIA  Subcommand     35.14  The  DIAGONAL  Subcommand 

35.15  The  ROTATION  Subcommand    654 

35.16  The  SAVE  Subcommand    655 

35.17  Matrix  Materials    660 

35.18  The  READ  Subcommand      35.19  The  WRITE  Subcommand 
35.20  LIMITATIONS  AND  SUMMARY  OF  SYNTAX    661 


Chapter  36        NONPAR  CORR    663 


36.1  OVERVIEW    663 

36.2  OPERATION     663 

36.3  Specifying  the  Design    664 

36.4  Types  of  Coefficients     665 

36.5  Two-Tailed  Significance  Tests    665 

36.6  Missing  Values    665 

36.7  Formatting  Options     668 

36.8  Random  Sampling     668 

36.9  Writing  Matrix  Materials    669 
36.10  LIMITATIONS     669



Chapter 37        NPAR TESTS    671


37.1  INTRODUCTION  TO  NONPARAMETRIC  TESTS    671 

37.2  OVERVIEW    671 

37.3  OPERATION    672 

37.4     One-Sample  Tests    672 

37.5  One-Sample  Chi-Square  Test     37.6  Kolmogorov-Smirnov  One-Sample  Test      37.7  Runs  Test 

37.8  Binomial  Test 
37.9    Tests  for  Two  Related  Samples    678 

37.10  McNemar  Test     37.11  Sign  Test     37.12  Wilcoxon  Matched-Pairs  Signed-Ranks  Test 
37.13  Tests  for  k  Related  Samples    682 

37.14  Cochran Q Test     37.15  Friedman Test     37.16  Kendall Coefficient of Concordance
37.17  Tests  for  Two  Independent  Samples    684 

37.18  Two-Sample Median Test     37.19  Mann-Whitney U Test

37.20  Kolmogorov-Smirnov Two-Sample Test     37.21  Wald-Wolfowitz Runs Test

37.22  Moses Test of Extreme Reactions

37.23  Tests for k Independent Samples    691

37.24  k-Sample Median Test     37.25  Kruskal-Wallis One-Way Analysis of Variance

37.26  Optional  Statistics    694 

37.27  Missing  Values    694 

37.28  Random  Sampling    695 

37.29  Aliases  for  Subcommand  Names    695 
37.30  LIMITATIONS  FOR  NPAR  TESTS    695 


Chapter  38        BOX-JENKINS    697 


38.1  OVERVIEW    697 

38.2  OPERATION    698 

38.3  The  VARIABLE  Subcommand    698 

38.4  Step-of-Analysis  Subcommands    698 
38.5  Plotting the Series

38.6    Transformation  Subcommands    700 

38.7  The  LOG  and  POWER  Subcommands 
38.8     Differencing  Subcommands    701 

38.9  The DIFFERENCE Subcommand     38.10  The SDIFFERENCE and PERIOD Subcommands

38.11  The  LAG  Subcommand    702 

38.12  Parameters  Subcommands    703 

38.13  Estimation  Subcommands    704 

38.14  Keywords  CONSTANT  and  NCONSTANT      38.15  Keywords  CENTER  and  NCENTER 

38.16  The  ITERATE  Subcommand      38.17  The  BFR  Subcommand      38.18  Keywords  TEST  and  NTEST 

38.19  The FPR Subcommand     38.20  Perturbation Increment Subcommands

38.21  Tolerance  Subcommands     38.22  Initial  Estimates  Subcommands 

38.23  Forecast  Subcommands    712 

38.24  The  ORIGIN  Subcommand     38.25  The  LEAD  Subcommand     38.26  The  CIN  Subcommand 
38.27  Final  Estimates  Subcommands 

38.28  The  PRINT  Subcommand    714 

38.29  The  PLOT  Subcommand    714 


Chapter  39        RELIABILITY    717 


39.1  INTRODUCTION  TO  RELIABILITY  MODELS    717 

39.2  OVERVIEW    718 

39.3  OPERATION    719 

39.4  The  VARIABLES  Subcommand     719 

39.5  The  SCALE  Subcommand    719 

39.6  The  MODEL  Subcommand    720 

39.7  Optional  Statistics    721 




WATER  QUALITY  MONITORING 
PROGRAMS 


DEVELOPED  BY 

Stanley  L.  Ponce 


WSDG  Technical  Paper 
WSDG-TP-00002 
December  1980 

Watershed  Systems  Development  Group 
USDA  Forest  Service 
3825  East  Mulberry  Street 
Fort  Collins,  Colorado  80524 


PREFACE 

Recent  legislation,  such  as  Public  Law  92-500  (the  Federal  Water 
Pollution  Control  Act  Amendments  of  1972),  RPA  and  NFMA,  and  public  opinion 
have  forced  water  quality  considerations  to  surface  in  many  land  and 
resource  decision  processes.  This  has  generated  a  need  to  provide 
decision-makers  with  information  about  existing  water  quality  and  the 
impacts  of  land  management  practices  on  water  quality.  In  general,  this 
information  is  obtained  through  water  quality  monitoring. 

Water  quality  monitoring,  which  is  defined  in  the  Forest  Service 
Manual  as  "the  systematic  evaluation  of  achievement  of  water  quality 
management  goals,  objectives,  or  targets,"  is  usually  the  responsibility  of 
the  forest  hydrologist.  The  purpose  of  this  Technical  Paper  is  to  help 
forest  hydrologists  develop  technically  sound  water  quality  monitoring 
programs.  The  material  presented  here  is  the  result  of  an  extensive 
literature  review  and  personal  experience. 

It  is  intended  that  this  paper  be  used  as  a  technical  guide,  not  a 
"cook  book."  Every  water  quality  monitoring  program  will  be  different.  As 
a  result,  each  program  will  require  that  the  hydrologist  understand  the 
hydrologic  system  at  hand  as  well  as  the  interaction  between  land-use 
activities  and  water  quality.  In  my  opinion,  there  is  no  substitute  for 
careful  planning  by  the  professional  forest  hydrologist  when  developing  a 
water  quality  monitoring  plan  of  operation  for  a  National  Forest. 

This  paper  was  designed  to  be  used  in  conjunction  with  Watershed 
Systems  Development  Group  (WSDG)  Technical  Paper  00001,  "Statistical 
Methods  Commonly  Used  in  Water  Quality  Data  Analysis";  and  WSDG  Application 
Documents  00001,  "Statistical  Analysis  Using  the  Statistical  Analysis 
System (SAS) at the EPA National Computer Center"; and 00002, "Statistical
Analysis Using the Statistical Package for the Social Sciences (SPSS) at
the  USDA  Fort  Collins  Computer  Center." 

I  would  like  to  acknowledge  all  the  following  people  who  reviewed  this 
paper  and  provided  many  valuable  suggestions  for  its  improvement:  Mr.  John 
Potyondy,  USDA  Forest  Service;  Dr.  David  W.  Schindler,  Fisheries  and 
Environment  Canada;  Dr.  Robert  C.  Averett,  USGS-WRD;  Dr.  Robert  Beschta, 
Oregon  State  University;  Mr.  Karl  Gebhardt,  BLM;  Dr.  Ken  Brooks,  University 
of  Minnesota;  Mr.  David  Ryn,  USDA  Forest  Service;  Dr.  Walt  Hivner,  Colorado 
State  University;  Dr.  David  DeWalle,  Pennsylvania  State  University;  Dr. 
Clarence  Skau,  University  of  Nevada;  Mr.  Ronald  Russell,  USDA  Forest 
Service;  Mr.  Owen  Williams,  USDA  Forest  Service;  Mr.  Rhey  Solomon,  USDA 
Forest  Service;  Mr.  Larry  Schmidt,  USDA  Forest  Service;  Mr.  Andrew  Leven, 
USDA  Forest  Service;  Mr.  Dallus  Hughes,  USDA  Forest  Service;  Mr.  Keith 
McLaughlin, USDA Forest Service; Mr. Harry Parrott, USDA Forest Service;
Mr.  Ted  Beauvais,  USDA  Forest  Service;  Ms.  Ann  Puffer,  USDA  Forest  Service; 
and  Mr.  Warren  Harper,  USDA  Forest  Service. 


TABLE  OF  CONTENTS 

Page 

1.0 Introduction 1

2.0  Types  of  Monitoring  2 

Cause-and-effect  2 

Compliance  3 

Baseline  3 

Inventory  3 

3.0  Defining  Problem  Areas  and  Setting  Study  Objectives  4 

4.0 Reviewing Past Work 8

5.0 Thinking About Data Analysis 11

6.0  Where,  What  and  When  13 

6.1  Guidelines  for  Locating  Sampling  Stations  14 

6.1.1  Station  Location  as  Influenced  by  the  Type  of  Monitoring     14 

6.1.2  Station  Location  as  Influenced  by  the  Water  Type           21 

6.2  Selecting  Water  Quality  Constituents  34 

6.3  Guidelines  for  Determining  Sampling  Frequency  36 

6.3.1  Systematic  Sampling  37 

6.3.2  Simple  Random  Sampling  38 

6.3.3  Stratified  Random  Sampling  47 

7.0 Guidelines for Collecting and Handling of Water Quality
Samples 55

7.1  Types  of  Samples  56 

7.1.1  Grab  Samples  56 

7.1.2  Composite  Samples  56 

7.2  Sample  Collection  57 

7.3  Sample  Handling  59 



8.0  Literature  Cited  64 

Appendix 


LIST  OF  FIGURES 


Page 

Figure  1  -  Example  of  Station  Location  for  Cause-and-Effect 

Monitoring  Study  Where  the  Treatment  can  be  Readily 
Isolated  15 

Figure  2  -  Hypothetical  Rating  Curves  of  Suspended  Solids 
(log  Qss)  versus  flow  (log  Qw)  for 
Stations  A  and  B.  16 

Figure  3  -  A  Paired-station  Plot  for  Suspended  Solids 

Concentration  17 

Figure  4  -  Sample  Station  Location  for  the  Paired  Watershed 

Approach  1° 

Figure  5  -  A  Plane  View  of  Sampling  Station  Location  at  a 

Swimming  Beach  Along  a  Lake  20 


Figure 6 - Sampling Station Location for Two Cases, I and II,
in Which a Point Source Effluent is Draining into
a Stream 20

Figure 7 - Example of Sampling Station Location for a
Cause-and-Effect Monitoring Study in Which a Tributary
is Involved 23

Figure 8 - An Illustration of Lateral Mixing 24

Figure 9 - Examples of Transect and Grid Sampling Schemes 26

Figure 10 - The Three Zones of a Temperature Profile in a
Stratified Lake 26

Figure 11 - Illustration of Sample Locations Along the
Depth Profile in a Stratified Lake 28

Figure 12 - Temperature Profile in a Lake or Reservoir
During the Period of Overturn, Either in the
Spring or Fall 29

Figure 13 - An Illustration of the Effect of Wind on the
Mixing of Water in the Epilimnion 29

Figure 14 - A Hypothetical Example of Where to Locate Sampling
Stations to Monitor Surface Water Quality on a
Multiple Use Lake 30

Figure 15 - Location of Sampling Stations Around a Solid Waste
Disposal Site 32



Figure  16  -  Radial  Design  of  Observation  Wells  Around  a 

Point  Source  33 


LIST  OF  TABLES 

Page 

Table  1  -  Indexes  for  Computerized  Search  of  Water  Resources 

Literature  9 

Table  2  -  Activities  and  Concerns  -  Water  Quality  Matrix  35 

Table 3 - Multiplier (M) of (sd²/d'²) to be Used in Paired
Comparative Sample Size Calculations (After Potyondy,
1977) 46

Table 4 - Electrical Conductivity Data (µmhos/cm) Collected

From  a  Rocky  Mountain  Stream  48 

Table  5  -  Summary  of  Special  Sampling  or  Sample  Requirements       60 


LIST  OF  EXAMPLES 


Page 

Example  1  -  Establishing  Study  Objectives  from 

Problem  Definitions  7 

Example  2a  -  Estimating  Sample  Size  for  the  Simple  Random 

Sampling  Method  41 

Example  2b  -  Estimating  Sample  Size  for  Simple  Random 

Sampling  43 

Example  3  -  Estimating  Sample  Size  for  a  Stratified 

Random  Sample  53 


WATER  QUALITY  MONITORING  PROGRAMS 

1.0  Introduction 

Designing  a  water  quality  monitoring  program  that  will  provide  useful 
information  is  an  intellectual  activity.  It  requires  a  great  deal  of 
thought  and  careful  planning.  Thinking  about  the  measurements  you  are 
going  to  make  and  why  you  are  going  to  make  them  leads  to  problem  solving. 

Just  as  a  blood  sample  gives  a  physician  insight  into  the  functions  of 
the  human  body,  a  water  sample  can  tell  a  hydrologist  a  great  deal  about 
the  complex  system  of  a  watershed.  The  quality  of  the  water  resource  is 
directly  related  to  natural  factors,  such  as  climate,  geology,  soils  and 
terrestrial  and  aquatic  vegetation;  and  man's  land-use  activities,  such  as 
timber  harvesting,  road  building,  grazing,  recreation  and  mining. 
Consequently,  to  obtain  useful  information  from  water  quality  monitoring, 
the  sampling  network  for  collection  of  data  must  be  properly  located  in 
both  time  and  space  and  the  constituents  which  are   relevant  to  the 
management  objectives  must  be  sampled.  In  addition,  if  the  monitoring  is 
to  be  cost  effective,  the  hydrologist  needs  to  evaluate,  at  the  outset  of 
the  program,  what  can  be  accomplished  with  the  resources  that  are 
available. 

The  purpose  of  this  paper  is  to  (1)  summarize  the  various  types  of 
water  quality  monitoring  commonly  carried  out  on  National  Forest  System 
lands  and  (2)  provide  a  series  of  guidelines  to  aid  you  with  problem 
definition,  establishing  study  objectives,  locating  past  work,  data 
analysis,  locating  sampling  stations,  selecting  water  quality  constituents, 
determining  sampling  frequency,  and  collecting  and  handling  samples. 


One  final  comment  before  we  begin  our  discussion  on  developing  water 
quality  monitoring  programs.  It  is  strongly  recommended  that  you  document 
your  program  in  the  form  of  a  water  quality  monitoring  plan  of  operation 
(see FSM 2542). A written monitoring plan serves several purposes. First,
it  forces  you  to  clearly  define  your  problem  and  study  objectives  as  well 
as  develop  a  logical  approach  to  collecting  data  which  will  provide 
information.  Second,  it  provides  your  supervisor  and  other  interested 
parties  with  a  statement  of  the  problem  you  plan  to  address,  how  you  will 
do  it,  the  type  of  data  that  will  be  obtained,  how  the  data  will  be 
analyzed,  the  expected  knowledge  to  be  gained,  the  financial  commitment 
required,  and  when  reports  are  to  be  done.  Finally,  if  you  leave  the 
Forest  before  the  project  is  completed,  it  provides  the  next  hydrologist 
with  the  proper  framework  to  continue  the  study.  In  general,  the  structure 
of  a  water  quality  monitoring  plan  varies  from  Region  to  Region.  However, 
the  major  components  of  most  plans  are  the  topics  discussed  in  this  paper. 

2.0  Types  of  Monitoring 

In  general,  the  types  of  water  quality  monitoring  performed  on 
National  Forest  System  lands  can  be  divided  into  four  categories: 
cause-and-effect,  compliance,  baseline,  and  inventory.  A  brief  summary  of 
each  follows. 

Cause-and-effect  (project)  monitoring  is  performed  to  quantify  the 
impacts  of  specific  land  management  activities  on  water  quality.  The 
information  obtained  from  this  type  of  study  is  often  used  to  evaluate  the 
effectiveness  of  "Best  Management  Practices,"  calibrate  existing  models 
which  were  developed  at  different  locations  or  under  different  conditions, 
and  develop  and  verify  models  designed  specifically  for  the  Forest. 



Cause-and-effect  monitoring  is  generally  implemented  on  a  project 
level.  The  surveys  are  designed  to  deal  with  questions  about  what  happened 
and  why.  The  monitoring  is  generally  short-term,  lasting  three  years  or 
less.  Whenever  possible,  paired  sampling  is  employed  with  samples  being 
collected  before,  during  and  after  the  treatment. 
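
The paired sampling described above can be illustrated with a short calculation. The following is a minimal sketch, not a prescribed procedure: it assumes samples are taken on the same dates above and below a treatment, and it computes the mean paired difference and the corresponding t statistic. The function name and the suspended-sediment values are hypothetical and are used only for illustration.

```python
import math
import statistics


def paired_t_statistic(control, treated):
    """Return (mean difference, t statistic, degrees of freedom) for paired samples."""
    if len(control) != len(treated) or len(control) < 2:
        raise ValueError("need two equal-length series with at least two pairs")
    # Difference within each pair (treated minus control)
    diffs = [t - c for c, t in zip(control, treated)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd_d == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    n = len(diffs)
    t_stat = mean_d / (sd_d / math.sqrt(n))
    return mean_d, t_stat, n - 1


# Hypothetical suspended-sediment concentrations (mg/L), sampled on the same dates
above = [12.0, 15.0, 11.0, 14.0, 13.0]  # station above the treatment
below = [14.0, 18.0, 12.0, 17.0, 15.0]  # station below the treatment

mean_d, t_stat, df = paired_t_statistic(above, below)
print(f"mean difference = {mean_d:.2f} mg/L, t = {t_stat:.2f}, df = {df}")
```

The t statistic would then be compared with a tabulated critical value for the chosen significance level and degrees of freedom. Pairing removes the variation common to both stations on a given date, which is why it is favored whenever it is possible.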

Compliance  monitoring  on  National  Forest  System  lands  is  performed 
primarily  to  protect  public  health.  It  includes  the  monitoring  of  drinking 
water  and  water  used  for  primary  contact  recreation.  The  water  quality  is 
generally  compared  with  existing  State  water  quality  standards  and  when 
these  standards  are  not  met,  corrective  action  should  be  taken  as  soon  as 
possible. 

Baseline  monitoring  is  performed  to  provide  land  managers  with 
reliable  information  on  water  quality  trends.  The  data  are  generally  used 
to  determine  if  water  quality  maintenance  and  improvement  criteria  required 
by  law  and/or  policy  are  being  met  and  for  long-term  trend  assessment.  If 
the  data  indicate  that  water  quality  degradation  is  occurring  as  a  result 
of  activities  on  the  National  Forest,  corrective  action  may  be  evaluated 
and  appropriate  action  initiated.  Water  quality  stations  associated  with 
this  type  of  monitoring  program  are  usually  located  at  strategic  points 
within  the  Forest  and  sampled  on  a  routine  basis  for  many  years. 

Inventory  monitoring  is  carried  out  to  provide  land  managers  with 
reliable information on existing water quality conditions. The data are
generally  used  to  provide  information  for  the  land  management  planning 
process  and  to  establish  water  quality  goals.  Usually  the  inventory  data 
are  obtained  from  existing  stations  established  for  cause-and-effect, 
compliance and baseline monitoring. However, if additional stations are
required, they are often located at strategic points within the Forest and
sampled  intensively  for  a  short  period  of  time. 

One  of  the  keys  to  an  effective  water  quality  program  is  to  integrate 
the  various  types  of  monitoring  so  that  they  are  complementary.  Some  of 
each  type  of  monitoring  will  generally  be  carried  out  on  all  Forests. 
Enough  of  each  type  should  be  accomplished  to  characterize  the  quality  of 
the  water  resource,  to  assess  the  impacts  of  management  activities  on  water 
quality  and  to  determine  if  water  quality  standards,  goals  and  objectives 
are  being  met. 

Priorities  for  monitoring  should  be  established  because  it  is  not 
feasible  to  monitor  the  water  quality  of  all  management  activities  or  all 
water  bodies  within  the  Forest.  Variation  of  priorities  between  Forests 
will  exist  depending  on  the  existing  data  base,  management  issues  and 
concerns,  and  water  quality  management  objectives. 

3.0  Defining  Problem  Areas  and  Setting  Study  Objectives 

The  first  step  in  developing  an  effective  water  quality  monitoring 
plan  is  to  define  problem  areas.  Each  problem  definition  must  evolve  from 
the  needs  identified  by  the  line  officer  for  information  which  will  aid  in 
making  management  decisions  (Boynton,  1972).  It  is  very  important  that  the 
needs  of  the  line  officer  be  clearly  identified  since  water  quality 
monitoring  can  only  be  justified  if  it  is  done  to  address  specific  needs  of 
management  for  information.  Furthermore,  commitment  by  line  officers  to 
monitoring  programs  is  achieved  through  their  involvement  in  problem 
identification  and  setting  specific  study  objectives. 

The  role  of  the  hydrologist  in  the  problem  definition  phase  is  to  take 
the lead in suggesting specific problem areas which are technically
feasible and satisfy the manager's needs. The hydrologist has the technical
expertise  and  the  familiarity  with  land  use  and  water  quality  relationships 
to  make  this  linkage.  Involvement  of  other  functional  specialists  with  an 
interest  in  water  quality,  such  as  fishery  biologists,  is  often  appropriate 
at  this  stage  to  coordinate  common  data  needs.  Interdisciplinary 
involvement  can  avoid  duplication  of  effort  and  address  a  multitude  of 
management  needs  at  one  time  (Potyondy,  1980). 

Problem  definitions  should  be  as  specific  as  possible.  A  problem 
definition,  such  as  "What  is  the  effect  of  land  use  on  the  quality  of  water 
draining  the  Routt  National  Forest?"  is  too  broad  to  be  of  much  use.  In 
this  case,  the  problem  definition  could  be  greatly  improved  if  (1)  the  land 
management  activity  of  interest  was  identified  (timber  harvesting,  mining, 
recreation,  etc.);  (2)  the  water  resource  was  specified  (stream,  lake 
and/or  ground  water);  and  (3)  the  type  of  water  quality  was  stated 
(physical,  chemical,  biological  and/or  radiological).  An  improved  problem 
definition  might  read  "What  is  the  effect  of  clearcutting  on  the  sediment 
regime  of  Trout  Creek?"  The  problem  definition  is  now  very  clear  and 
direct.  Often  times  problem  definitions  will  not  be  this  specific.  More 
often  they  are  as  follows: 

1.  A  reliable  method  to  predict  the  effect  of  clearcutting  on  the 
sediment  yield  for  the  various  stream  types  found  in  the  Forest 
is  needed. 

2.  A  simple,  reliable  approach  to  classify  lakes  by  water  quality 
within  the  Forest  is  needed. 

These problem statements, broad as they may appear, are consistent with the
water quality information needed in the land management planning process
and still provide the hydrologist with sufficient guidance to formulate
study objectives.


Once  the  problem  areas  have  been  defined,  the  next  step  is  to 
establish  study  objectives.  This  process  should  also  be  a  mixed  effort 
between the hydrologist and the line officer. The hydrologist's role,
because  of  his  technical  knowledge  of  the  watershed  system  and  land  use/ 
water  quality  interactions,  is  to  suggest  specific  monitoring  objectives 
while  the  line  officer's  role  is  to  act  as  a  sounding  board,  continually 
asking  why  and  making  sure  the  objectives  speak  only  to  his  needs  and  that 
the  plan  fits  within  the  available  resources  (Boynton,  1972).  When  the 
objectives  are  agreed  upon  by  the  hydrologist  and  line  officer,  they  should 
be  documented  in  written  form. 

Objectives  should  be  specific  statements  of  measurable  results  to  be 
achieved  within  a  stated  time  period.  In  addition,  they  should  be  specific 
enough  so  that  the  hydrologist  can  convert  them  into  statistical  hypotheses 
which  can  be  tested  with  the  data  obtained  from  the  water  quality 
monitoring  program  (more  about  this  in  Section  5.0).  Some  illustrations  of 
problem  definitions  and  related  study  objectives  are  given  in  Example  1. 

Defining the problem and setting the study objectives may seem like a
lot of work which will require a substantial amount of your time. It is
and it does. However, it is time very well spent. The point is, if you
have spent time defining your objectives and making sure that they are
compatible with management's needs, there is a very good chance that your
study will be successful and provide meaningful information to the land
manager.


Example  1 
Establishing study objectives from problem definitions.


Case  A. 

Problem  Definition: 

Does  the  water  at  Public  Beach  A  pose  a  health  hazard  to  primary 
contact  recreationists? 

Study  Objective: 

To  determine  if  the  water  at  Public  Beach  A  meets  the  State 
standards  for  swimming  during  the  summer  of  1980. 

In this case, the strategy is to monitor the water quality at Public
Beach A over the summer and compare it with the State standards for primary
contact recreation.

Case B.

Problem  Definition: 

Is  acid  precipitation  adversely  affecting  the  productivity  of 
Agnes  Lake? 

Study  Objectives: 

1.  To  determine  the  pH  of  the  precipitation  on  a  seasonal 
basis  at  Agnes  Lake  over  the  next  five  years. 

2.  To  determine  the  seasonal  trend  of  pH,  alkalinity  and 
conductivity  in  Agnes  Lake  over  the  next  five  years. 

3.  To  determine  the  biological  significance  of  any  change  in 
pH,  alkalinity  and  conductivity  in  Agnes  Lake  that  occurs 
over  the  next  five  years. 

In this case, the strategy is to quantify the seasonal input of acid
(hydrogen ions) to the lake from precipitation, to develop the trend of the
lake's response over the next five years, and determine if this response is
biologically significant.


4.0  Reviewing  Past  Work 

After  the  objectives  have  been  established,  the  next  step  is  to 
determine  what  has  already  been  done.  Several  common  sources  of  data  of 
interest to the wildland hydrologist are listed below:

1.  Forest,  District,  and  Regional  Office  resource  reports. 

2.  U.S.  Forest  Service  research,  U.S.  Geological  Survey,  U.S. 
Environmental  Protection  Agency,  Bureau  of  Land  Management,  Water 
and  Power  Resources  Administration,  Corps  of  Engineers,  National 
Oceanic  and  Atmospheric  Administration,  and  Soil  Conservation 
Service. 

3.  State  Geological  Survey,  State  Department  of  Health,  State 
Department  of  Engineering,  and  State  Water  Pollution  Control 
Agency. 

4.  State  universities,  especially  the  departments  specializing  in 
watershed  management,  hydrology,  geology,  chemistry,  aquatic 
biology,  limnology,  and  microbiology. 

5.  River  basin  commissions. 

6.  STORET. 

In  addition  to  the  sources  mentioned  above,  several  of  the  Regions  now 
have  agreements  with  Forest  Service  research  libraries  or  other  libraries 
which  provide  computerized  literature  searches.  The  major  indexes 
presently  available  or  soon  to  be  available  are  summarized  in  Table  1. 

Most  of  the  time,  you  can  expect  that  little  if  any  data  will  be 
available  from  your  watershed  of  interest,  or  if  they  are,  they  often  will 
be  the  wrong  kinds  of  data.  You  can  sometimes  circumvent  this  problem  by 
reviewing  information  available  from  tributary  streams  or  adjacent 
drainages.  However,  you  must  be  cautious  when  transferring  data  from  one 
place  to  another. 

Whenever  data  are  available  from  your  watershed  of  interest,  they 
probably  will  have  been  collected  for  another  purpose  and  will  not  solve 
your  specific  problem.  Nevertheless,  such  data  can  provide  you  with 



Table 1. Indexes for computerized search of water
resources literature (modified from Busby, 1980).

INDEX        SUBJECT AREA

AGRICOLA     Covers worldwide journal and monographic literature in
             agriculture and related subject fields, including
             forestry, natural resources, chemistry and water
             resources. Prepared by the U.S. National Agricultural
             Library.

AQUALINE     Provides access to information on every aspect of water,
             waste water, and the aquatic environment. Worldwide
             sources cited are 400 periodicals, research reports,
             legislation, conference proceedings and preprints, books,
             monographs, pamphlets, dissertations, translations,
             standards and specifications, and miscellaneous
             publications from water-related institutions worldwide.
             Prepared by the Water Research Centre.

BIOSIS       Includes contents of Biological Abstracts and
PREVIEWS     Bio-Research Index, covering the entire life sciences.
             Citations are taken from approximately 8,000 serial
             publications, as well as books. Prepared by Biological
             Sciences Information Service.

CDI          Comprehensive Dissertation Index, containing all
             dissertations accepted for academic doctoral degrees
             granted by United States education institutions and some
             non-U.S. universities. Prepared by University Microfilms
             International.

COMPENDEX    Covers civil, environmental and geological engineering;
             mining, metals, petroleum and fuel engineering;
             mechanical, automotive, nuclear and aerospace
             engineering; chemical, agricultural and food engineering;
             and industrial engineering, management, mathematics,
             physics and instruments. Prepared by Engineering Index,
             Inc.

GeoRef       Geological Reference file, covering geosciences
             literature from 3,000 journals, plus conferences and
             major symposia and monographs in such areas as
             environmental geology, geochemistry, and fluvial
             geomorphology. Prepared by the American Geological
             Institute.

NTIS         This is a broad and cross-disciplinary file containing
             citations and abstracts of government-sponsored research
             and development reports and other government analysis
             prepared by Federal agencies or their contractors and
             grantees. Prepared by National Technical Information
             Service of the U.S. Department of Commerce.

POLLUTION    Covers non-U.S., as well as domestic reports, journals,
             contracts, patents and symposia in the areas of pollution
             control and research. Prepared by Pollution Abstracts,
             Data Courier, Inc.

WATERLIT     Covers the water resources and water-related literature
             of the world. WATERLIT topics include, but are not
             limited to, water supply, reservoirs of all types, water
             utilization, water standards, limnology, health aspects
             of water, water law and water ecology. It is produced by
             the South African Water Information Centre.

WRD          Water Resources Abstracts is a computerized version of
             Selected Water Resources Abstracts, a semimonthly journal
             published by the Office of Water Research and Technology.
             It covers literature of water-related aspects of the
             life, physical and social sciences as well as related
             engineering and legal aspects of the characteristics,
             conservation, control, use, or management of water.


information about the interactions between land use, hydrology, and water
quality and be very useful in the design of your sampling program.

5.0  Thinking  About  Data  Analysis 

This is the stage of your study design when you should begin thinking
about how the data will be analyzed. You should start by converting your
objective statements into null (H0) and alternative (Ha) hypotheses.
For example, consider the objective presented in Case A, Example 1. The
study objective is a very specific water quality concern which can be
readily converted into a set of null and alternative hypotheses. The
hypotheses to be tested could be stated as follows:

H0:  The water at Public Beach A does not exceed the State water
     quality standards for swimming during any portion of the summer
     of 1980.

Ha:  The water at Public Beach A exceeds the State water quality
     standards for swimming at some time during the summer of 1980.

At this point, we are ready to select a statistical model which will
allow an efficient test of the null hypothesis against the alternative
hypothesis. The statistical methods that you select, along with the
knowledge you have gained about the system through reviewing past work,
will influence where you sample, such as above or below a treatment or at
the mouths of paired watersheds offering impact and controlled data
comparisons; and when and how often you sample, such as once a season
without replication or diurnally with replication. If you do not feel
comfortable designing your statistical analysis, you should review in
detail WSDG Technical Paper 00001 ("Statistical Methods Commonly Used in
Water Quality Data Analysis", Ponce, 1980) and/or seek the aid of a
statistician.
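
To make the Beach A hypotheses concrete, the sketch below takes the most literal reading of H0: if no sample may exceed the standard, then a single observed exceedance is enough to reject it. This is only an illustration under stated assumptions. The standard of 200 organisms per 100 mL, the sample dates, and the counts are all hypothetical, and sampling error is ignored. An actual State standard may be defined differently, for example as a mean over several samples.

```python
def exceedances(samples, standard):
    """Return the (date, value) pairs whose value exceeds the standard."""
    return [(date, value) for date, value in samples if value > standard]


# Hypothetical swimming standard (organisms per 100 mL); not an actual State value
STANDARD = 200.0

# Hypothetical bacteria counts at Public Beach A during the summer of 1980
beach_a = [
    ("1980-06-15", 40.0),
    ("1980-07-01", 120.0),
    ("1980-07-15", 260.0),
    ("1980-08-01", 90.0),
]

violations = exceedances(beach_a, STANDARD)
reject_h0 = len(violations) > 0  # any exceedance contradicts H0 as stated

for date, value in violations:
    print(f"{date}: {value:.0f} exceeds the standard of {STANDARD:.0f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Note that failing to reject H0 in this way says only that no exceedance was observed on the sampled dates, which is one reason sampling frequency has to be considered together with the hypotheses.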




There  are  a  few  principles  that  you  should  keep  in  mind  when  you  begin 
thinking  about  your  data  analyses.  These  have  been  summarized  from  Green 
(1979). 

1.  Carry  out  some  preliminary  sampling  to  provide  a  basis  for 
evaluation  of  sampling  design  and  statistical  analysis  options. 
Those who skip this step because they do not have enough time or
money usually end up losing both time and money.

2.  To  test  whether  a  condition  (treatment)  has  an  effect,  collect 
samples  both  where  the  condition  (treatment)  is  present  and  where 
it  is  absent  but  all  else  is  the  same.  Remember,  an  effect  can 
only  be  demonstrated  by  comparison  with  a  control. 

3.  If  possible,  take  replicate  samples  within  each  combination  of 
time,  space,  and  any  other  controlled  variable.  Differences 
among  can  only  be  demonstrated  by  comparison  to  differences 
within.  For  example,  if  you  are  comparing  NO3  yield  from  a 
clearcut  area  with  a  forested  area,  only  if  you  take  replicate 
samples  can  you  separate  sampling  error  from  differences  due  to 
the  treatment. 

4.  If  the  system  to  be  sampled  has  a  large-scale  environmental 
pattern,  break  up  the  system  into  relatively  homogeneous 
subsystems  and  allocate  samples  to  each  by  some  predetermined 
weighting  criteria.  For  example,  if  you  are  measuring  TDS  in  the 
northern  Rockies,  you  could  reduce  the  overall  variance 
substantially if you broke your sampling periods into three
strata (baseflow, snowmelt, and stormflow) and weighted each by
discharge.
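
Principle 3 above can be illustrated with a one-way analysis of variance, in which the variation among groups is compared with the variation within groups. The sketch below is one common way to make that comparison; it is not taken from Green (1979), and the nitrate values are hypothetical. It also shows why replicates are needed: with a single sample per group there are no degrees of freedom left to estimate the within-group variation.

```python
import statistics


def f_ratio(groups):
    """One-way ANOVA F ratio: among-group mean square / within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total - k <= 0:
        raise ValueError("need at least two groups and replicate samples within groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means about the grand mean ("differences among")
    ss_among = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the replicates about their own group mean ("differences within")
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    ms_among = ss_among / (k - 1)
    ms_within = ss_within / (n_total - k)
    if ms_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    return ms_among / ms_within


# Hypothetical replicate nitrate concentrations (mg/L)
clearcut = [0.8, 1.0, 1.2]
forested = [0.3, 0.4, 0.5]

f_value = f_ratio([clearcut, forested])
print(f"F = {f_value:.1f} with 1 and 4 degrees of freedom")
```

A large F ratio relative to the tabulated critical value suggests that the difference between areas is greater than can be explained by sampling error alone.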

It is very important that you consider the statistical analysis at
this stage of the study design. As Averett (1979) states, "problems almost
always  arise  when  statistical  methods  become  an  afterthought  of  study 
design  and  are  used  as  a  salvage  operation.  This  'afterthought' 
application  of  statistical  methodology  leads  to  the  deadliest  data  analysis 
trap  of  all—the  mathematical  manipulation  of  non-related,  non-correlated 
data,  into  a  probability  function." 

One  final  comment  before  we  proceed;  it  is  important  that  you  keep  the 
role  of  statistical  methods  in  proper  perspective.  Their  primary  use  is  to 
reduce data and to help us make "yes" or "no" statements about the
relation of samples collected from different populations. While there is
much  merit  in  designing  water  quality  sampling  studies  around  a  statistical 
framework,  it  must  be  emphasized  that  the  statistical  testing  of  data  is 
not  interpretation  of  data  (Averett,  1979).  It  is  the  responsibility  of 
the  hydrologist  to  interpret  the  results  of  the  statistical  analysis  and 
provide  the  line  officer  with  information  which  can  be  used  in  the  decision 
making  process. 

6.0  Where,  What  and  When 

At  this  stage  of  your  study  design,  you  are  ready  to  select  your 
sampling  stations  (where),  choose  the  water  quality  constituents  to  be 
sampled  at  each  station  (what),  and  determine  the  sampling  frequency  of 
each  constituent  at  each  sampling  station  (when).  This  phase  of  the  study 
design  requires  a  sound  understanding  of  the  hydrologic  system  and  how  the 
water  quality  relates  to  the  beneficial  uses  of  the  water  resource.  If  the 
study  objectives  have  been  clearly  stated  and  you  have  spent  time  thinking 
about  the  interaction  between  land  use,  hydrology,  and  water  quality  in 
your  system,  the  determination  of  where,  what,  and  when  should  be  fairly 
straightforward. 

Throughout  this  section  you  should  keep  two  points  in  mind.  First, 
where,  what,  and  when  you  sample  should  be  directly  related  to  the  needs 
and  objectives  of  the  study.  Remember,  the  line  officer  holds  you 
responsible  for  the  water  quality  data  collected  and  it  is  your  job  to  see 
to  it  that  unnecessary  data  are  not  obtained.  Second,  station  location, 
parameter selection, and sampling frequency are all very important. You
cannot  short  cut  one  without  affecting  the  others  (Averett,  1976). 




6.1  Guidelines  for  Locating  Sampling  Stations 

There  are  two  factors  which  strongly  influence  the  location  of 
sampling  stations:  (1)  the  type  of  monitoring  and  (2)  the  type  of  water 
body.  Guidelines  for  locating  sampling  stations  are  discussed  for  each  of 
these  factors  separately. 

6.1.1  Station  Location  as  Influenced  by  the  Type  of  Monitoring 

As  you  recall,  water  quality  monitoring  on  National  Forest  System 
lands  can  generally  be  classified  as  (1)  cause-and-effect,  (2)  compliance, 
(3)  baseline,  and  (4)  inventory.  Locating  the  sampling  stations  for 
cause-and-effect  monitoring  is  generally  the  easiest  to  carry  out.  The 
strategy  in  this  case  is  to  isolate  the  treatment  effects  by  (1)  sampling 
above  and  below  the  treatment  and/or  (2)  sampling  before  and  after  the 
treatment.  Consider  the  example  presented  in  Figure  1.  There  we  have  a 
treatment  which  covers  only  a  portion  of  a  small  stream.  Stations  A  and  B 
have  been  placed  immediately  above  and  below  the  treatment,  respectively, 
to  isolate  it.  Station  A  represents  the  control.  Station  B,  in  theory,  is 
assumed  to  be  similar  to  Station  A  in  all  respects  except  that  it  includes 
the  effect  of  the  treatment.  Whenever  the  "above  and  below"  approach  is 
used,  you  must  be  certain  the  above  station  is  a  satisfactory  control. 

The  type  of  sampling  design  shown  in  Figure  1  readily  lends  itself  to 
two  types  of  statistical  testing:   (1)  comparison  of  the  means  of  Stations 
A  and  B  and  (2)  comparison  of  the  regression  of  Stations  A  and  B.  If  the 
variance  of  the  water  quality  parameter  of  interest  is  not  strongly 
influenced  by  fluctuations  in  the  stream  flow,  a  simple  comparison  of  the 
means  can  be  made  to  test  for  treatment  effect.  The  hypotheses  to  be 
tested  are  as  follows: 



Figure  1.  Example  of  station  location  for  cause  and  effect 
monitoring  study  where  the  treatment  can  be  readily  isolated. 


$H_0: \mu_A = \mu_B$
$H_a: \mu_A \neq \mu_B$

where $\mu_A$ and $\mu_B$ denote the mean at Stations A and B, respectively. The
statistical  method  generally  employed  to  make  this  comparison  is  the  paired 
t-test.  However,  if  the  variance  is  strongly  influenced  by  discharge,  it 
is  very  likely  that  the  treatment  effects  will  be  masked.  If  you  develop  a 
regression  of  the  water  quality  constituent  versus  discharge  (commonly 
referred  to  as  a  rating  curve)  you  can  remove  or  explain  much  of  the 
variance  due  to  flow  and  make  a  stronger  test  of  the  treatment  effect. 
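As a minimal sketch of the paired t-test described above, the following Python computes the test statistic from paired samples. The station values are hypothetical, and the critical value is the tabulated two-tailed value for 4 degrees of freedom at the 0.05 level; the function name is an assumption made for the example.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [y - x for x, y in zip(a, b)]
    n = len(diffs)
    # t = mean difference divided by its standard error
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Hypothetical paired observations taken on the same dates.
station_a = [12.0, 15.0, 11.0, 14.0, 13.0]  # above the treatment (control)
station_b = [14.0, 18.0, 12.0, 17.0, 15.0]  # below the treatment

t, df = paired_t(station_a, station_b)
critical = 2.776  # two-tailed, alpha = 0.05, 4 degrees of freedom (t table)
reject_h0 = abs(t) > critical
print(t, df, reject_h0)
```

Pairing by sampling date is what lets the test cancel out day-to-day variation common to both stations.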

Figure 2. Hypothetical rating curves of suspended solids
(log Qss) versus flow (log Qw) for Stations A and B.

A suspended solids rating curve is illustrated in Figure 2. Note that a
log X transformation has been applied to the data to obtain a linear
regression. This is usually required since most water quality constituents
are  best  related  to  flow  by  a  power  function,  which  can  be  linearized  with 
a  log  X  transformation.  To  test  for  the  treatment  effect,  we  would  compare 
the  slopes  of  the  regression  lines  and  their  intercepts.  The  hypotheses  to 
be  tested  are  as  follows: 

$H_0$: slope A = slope B          $H_0$: intercept A = intercept B
$H_a$: slope A $\neq$ slope B     $H_a$: intercept A $\neq$ intercept B


Covariance  analysis  would  be  the  statistical  method  employed  to  make  these 
comparisons. 
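The rating-curve fit itself can be sketched as follows. This example only fits the two log-log regressions and reports their slopes and intercepts; the formal covariance analysis that tests whether they differ is not shown. The data are hypothetical and were chosen to follow exact power functions, and the function names are assumptions for illustration.

```python
from math import log10


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def rating_curve(flow, sediment):
    """Fit log10(Qss) = log10(a) + b * log10(Qw); return (b, log10(a))."""
    return fit_line([log10(q) for q in flow], [log10(s) for s in sediment])


# Hypothetical data: Station A follows Qss = 2 * Qw, Station B Qss = 2 * Qw^2.
flow = [1.0, 10.0, 100.0]
ss_a = [2.0, 20.0, 200.0]
ss_b = [2.0, 200.0, 20000.0]

slope_a, intercept_a = rating_curve(flow, ss_a)
slope_b, intercept_b = rating_curve(flow, ss_b)
print(slope_a, intercept_a)
print(slope_b, intercept_b)
```

Here the two stations share an intercept but differ in slope, which is the kind of difference the slope hypothesis above is meant to detect.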

Figure 3. A paired-station plot for suspended solids concentration.
(Axis label: SS (mg/l) at Station B; legend: after treatment, before
treatment.)

If the above and below stations were established prior to the
treatment and a paired sample data base developed both before and after the
treatment, the opportunity exists to develop a paired-station plot. Such a
plot for suspended solids concentrations at Stations A and B, both before
and after treatment, is illustrated in Figure 3. In general, these
regressions have strong correlation coefficients because many of the
background variables that contribute to variance in the data, such as
climatic  and  hydrologic  variables,  have  been  normalized  at  both  stations. 
Consequently,  this  method  enables  us  to  make  a  better  assessment  of  the 
treatment  effects  than  any  of  the  methods  previously  described.  The  actual 
statistical  comparison  is  the  same  as  that  explained  for  the  regression 
curves. 

In  some  cases,  we  cannot  isolate  a  treatment  by  placing  stations  above 
and  below.  Such  an  instance  is  illustrated  in  Figure  4.  Here  the 
treatment,  which  could  be  a  vegetative  conversion  on  a  grazing  allotment, 
covers  an  entire  tributary  system.  There  are  two  approaches  to  locating 
sampling stations in this case. The first is to simply position a station
immediately below the treatment (such as Station A, Figure 4). The second
is to add another station (such as Station B, Figure 4) on a watershed which
is similar to the treated watershed in all respects (that is, climate,
geology, soils, vegetation, land use, etc.) except that it is not influenced
by the treatment.



Figure  4.  Sample  station  location  for  the  paired  watershed  approach. 


With  either  approach,  a  valid  assessment  of  the  treatment  effect  would 
require  sampling  both  before  and  after  the  treatment.  If  only  one  station 
is  established,  the  statistical  comparison  will  be  made  using  the  before 
and  after  means  or  regression  lines.  If  two  stations  are  established,  the 
comparisons  can  be  made  using  the  before  and  after  means  or  paired-station 
regressions.  The  paired  station  approach  is  recommended  over  the  single 
station  approach  because  it  allows  you  to  account  for  year-to-year 
variation  in  climate  and  hydrology. 

Compliance  monitoring  is  generally  performed  to  protect  public  health 
and  to  assure  that  waters  draining  from  National  Forest  System  lands  meet 
State  water  quality  standards.  In  general,  station  location  involves  the 
positioning  of  a  single  sampling  station  or  a  pair  of  stations.  Consider 
the situation where the drinking water in a campground needs to be tested.
In general, there is a single water source, such as a well or stream, from
which  the  water  is  collected  and  distributed  through  lines  to  various 
locations  within  the  campground.  In  this  type  of  a  situation,  care  should 
be  taken  not  to  select  a  single  water  tap  and  designate  it  as  the  sampling 
station,  but  instead  each  time  a  sample  is  required,  select  any  one  of  the 
water  taps  at  random  (not  haphazardly)  and  then  collect  the  sample. 
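Selecting a tap at random, rather than haphazardly, can be done with any random number source. The sketch below is one way to do it in Python; the tap labels and the seed are hypothetical, and the seed is fixed only so the example is repeatable.

```python
import random


def pick_tap(taps, rng=random):
    """Choose one tap with equal probability for this sampling visit."""
    return rng.choice(taps)


taps = ["tap_1", "tap_2", "tap_3", "tap_4"]  # hypothetical tap labels
rng = random.Random(1985)  # seeded only so the sketch is repeatable

# One randomly chosen tap for each of six sampling visits.
schedule = [pick_tap(taps, rng) for _ in range(6)]
print(schedule)
```

Every tap has the same chance of selection on every visit, so no single tap becomes the de facto sampling station.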

In  the  case  of  a  swimming  beach,  such  as  that  illustrated  in  Figure  5, 
you  might  have  to  establish  several  sampling  stations.  Because  of  the 
shape  of  the  lake,  one  sampling  station  may  not  be  enough  to  provide  a 
representative  sample.  Consequently,  the  area  of  concern  may  have  to  be 
divided  into  homogeneous  strata,  each  of  which  is  sampled  separately.  This 
type  of  sampling  design  enables  you  to  make  a  direct  comparison  with  the 
standard  or  compare  the  sample  mean  with  the  standard. 

Sometimes  compliance  monitoring  requires  the  surveillance  of  point 
sources.  Consider,  for  example,  a  sewage  lagoon  which  treats  the  waste 
from  a  campground  and  whose  effluent  drains  into  a  perennial  stream  (Figure 
6).  There  are  two  approaches  to  locating  sampling  stations  in  this 
situation.  If  the  State  standards  require  the  effluent  to  be  of  a  fixed 
quality  or  better,  the  station  should  be  positioned  to  sample  the  effluent 
directly, such as in Case I, Figure 6. If the State standards require that
the effluent not change the quality of the stream by more than a set
amount, such as a temperature increase of 2°C, stations would have to be
positioned above and below the outfall (Case II, Figure 6).
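For Case II, the compliance check reduces to a difference between the paired above and below readings. A minimal sketch, assuming hypothetical temperature readings and using the 2°C limit from the example above (the function name is an assumption):

```python
def exceeds_allowed_increase(above, below, allowed_increase):
    """True if the below-outfall value exceeds the above-outfall value
    by more than the allowed increase."""
    return (below - above) > allowed_increase


# Hypothetical stream temperatures (deg C) on two sampling dates.
print(exceeds_allowed_increase(above=12.0, below=14.5, allowed_increase=2.0))
print(exceeds_allowed_increase(above=12.0, below=13.5, allowed_increase=2.0))
```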

Baseline  monitoring  is  designed  to  provide  information  on  water 
quality  trends.  In  general,  stations  are  positioned  strategically 
throughout  a  Forest  or  District  (such  as  at  the  mouths  of  major  streams  or 
confluences of major tributaries) to obtain trend information for a wide
range of conditions, such as climate, topography, geology, soils,
vegetation and land use.

Figure 5. A plan view of sampling station locations
at a swimming beach along a lake. (• = Sampling station)

Figure 6. Sampling station location for two cases, I and II,
in which a point source effluent is draining into a stream.

Inventory  monitoring  is  designed  to  characterize  the  water  quality  of 
a Forest on a broad scale. Sampling stations are usually located on major
streams at or near the Forest boundary or at other strategic locations
within the Forest. These stations are often positioned so that they
integrate several different land uses. As a result, the quality of water
at these stations often represents the cumulative impacts resulting
from multi-resource management activities on the Forest.

6.1.2  Station  Location  as  Influenced  by  the  Water  Type 

In  general,  there  are  three  types  of  water  bodies  of  concern  to  the 
forest  hydrologist:  (1)  streams,  (2)  lakes  and  reservoirs,  and  (3) 
groundwater.  The  establishment  of  sampling  stations  along  or  in  any   of 
these  water  bodies  is  directly  related  to  the  characteristics  that  control 
the  movement  of  water  and  distribution  of  water  quality  parameters  in  that 
water  body. 

There  are  several  factors  that  you  should  consider  when  you  are 
locating  sampling  stations  in  streams:  (1)  tributaries,  (2)  mixing 
characteristics,  (3)  suitability  for  discharge  measurements,  (4) 
accessibility,  and  (5)  suitability  for  biological  monitoring.  Tributaries 
should  always  be  considered  in  locating  sampling  stations  because  of  the 
effect  they  can  have  on  the  receiving  water.  The  question,  however,  is 
whether  or  not  a  specific  tributary  should  be  included  in  the  monitoring 
program.  In  general,  tributaries  involved  in  cause-and-effect  and 
compliance  monitoring  studies  should  be  monitored.  If  they  are  not 
included, it is very difficult to isolate constituents of concern and
minimize variability. An example of station location for a
cause-and-effect study in which a tributary is involved is presented in
Figure 7. Placing sampling stations above and below the clearcuts
(treatment of concern) on both the mainstem and tributary allows us to
assess the effect of logging on stream quality and to exclude the effects
of the pasture and the mountain home development.

The  problem  lies  with  baseline,  inventory,  and  mixed  monitoring 
studies  where  large  areas  are  involved.  It  is  not  practical  to  include 
every tributary in our monitoring network, so how do we decide which ones
to  include?  Ideally,  the  best  way  to  make  this  assessment  is  to  carry  out 
a  preliminary  reconnaissance  and  sample  all  the  tributaries  at  least  once. 

However,  most  of  the  time  this  is  not  possible  because  of  constraints 
in  manpower,  time,  and  money.  The  hydrologist,  therefore,  must  consider 
each  tributary  separately  and  develop  a  list  of  potential  tributaries  to 
sample.  Averett  (1976)  suggests  you  consider  the  following  guidelines  when 
performing  this  task. 

1.  Be  thoroughly  familiar  with  the  physical  characteristics  of  the 
system  you  are  studying.  Consider  such  things  as  drainage  area, 
geology,  soils,  vegetative  type  and  land  use.  A  large  variation 
of  any  of  these  factors  in  a  tributary  from  the  conditions  of  the 
mainstem  calls  for  the  tributary  to  be  included  in  the  sampling 
network. 

2.  Consider  the  dissolved  solids  concentration  or  the  electrical 
conductivity  of  the  tributary.  If  during  low  flow  periods 
electrical  conductivity  or  dissolved  solids  are  higher  or  lower 
when  compared  to  the  mainstem  flow,  then  you  have  strong  reason 
to  consider  monitoring  the  tributary. 

3.  Look  for  sediment  plumes  and  sand  and  gravel  bars  near  the  mouth 
of  tributaries.  The  presence  of  these  features  is  an  indicator 
of  erosion  upstream  and  is  reason  to  consider  monitoring  the 
tributary. 

4. If a tributary provides a proportionately large volume of flow to
the mainstem, you should consider establishing a monitoring
station at its mouth. An upstream tributary may be small
compared to the downstream mainstem. However, in its upstream
location, the tributary may contribute substantially to the
mainstem both in quantity and quality. In other words, you
should not select tributaries for sampling based upon volume of
flow alone, but rather based on their volume relative to the
mainstem at the confluence.

5. If a tributary is of sufficient volume and different water
quality to influence the mainstem, it may be useful to establish
some stations on the tributary other than at its mouth.

Figure 7. Example of sampling station location for a cause-and-effect
monitoring study in which a tributary is involved. Stations A and C lie on
the mainstem while Station B is on the tributary.
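Guideline 4 turns on relative rather than absolute flow. A small sketch of that comparison, using hypothetical discharges in any consistent unit (the function name is an assumption):

```python
def tributary_share(q_tributary, q_mainstem_above):
    """Fraction of the combined flow at the confluence that comes
    from the tributary."""
    return q_tributary / (q_tributary + q_mainstem_above)


# The same tributary flow matters far more at an upstream confluence.
print(tributary_share(q_tributary=2.0, q_mainstem_above=6.0))
print(tributary_share(q_tributary=2.0, q_mainstem_above=198.0))
```

The tributary supplies a quarter of the flow at the first confluence but only one percent at the second, even though its own discharge is unchanged.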

How well mixed a water quality constituent is in a stream is dependent
upon  the  physical  and  chemical  nature  of  the  constituent  as  well  as  the 
physical  characteristics  of  the  stream.  The  physical  characteristics  of 
the  stream  which  affect  mixing  include  temperature,  depth,  velocity, 
turbulence,  slope,  changes  in  direction,  and  roughness  of  the  bottom. 

In  general,  if  the  sampling  point  of  interest  is  some  distance 
downstream  from  a  tributary  or  other  point  source,  such  as  a  sewage  outfall 
or  irrigation  return  flow,  the  water  quality  is  usually  fairly  well  mixed 
across  the  cross  section.  Most  sampling  problems  involve  mixing  below 
tributaries  and  other  point  sources.  Vertical  mixing  (from  surface  to 
bottom)  is  usually  quite  rapid  due  to  the  turbulence  of  mountain  streams. 
Lateral mixing (from one side to the other), on the other hand, may not be
complete until the stream has passed through several sharp bends. Consider
the  example  presented  in  Figure  8.  The  water  from  the  tributary  "hugs"  the 
bank  until  the  first  bend  has  been  entered.  In  this  bend,  the  tendency  of 
the  water  is  to  continue  in  a  straight  line  and,  as  a  result,  mixing 
begins.  By  the  time  the  water  enters  the  third  bend,  the  lateral  mixing  is 
nearly  complete.  Consequently,  when  you  are  positioning  stations  below  a 
tributary  or  other  point  source,  make  sure  that  you  thoroughly  consider  the 
mixing  effects.  If  you  do  not,  your  sample  may  not  be  representative  of 
the  system. 


Figure 8. An illustration of lateral mixing.


When  establishing  sampling  stations  in  the  field,  it  is  important  that 
you  consider  the  suitability  of  each  station  for  discharge  measurements. 
Many  water  quality  studies  on  streams  have  been  of  little  use  because 
discharge  measurements  were  not  made  and  most  water  quality  constituents 
are  flow  dependent.  Without  discharge  measurements,  you  cannot  perform  a 
mass  balance  or  determine  mass  yield,  both  of  which  are  important  water 
quality  data  analysis  techniques. 
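To show why discharge is needed, the sketch below computes a mass load and a simple two-stream mass balance. It assumes concentrations in mg/l and discharge in cubic meters per second; the function names and example values are hypothetical.

```python
def load_kg_per_day(conc_mg_per_l, discharge_m3_per_s):
    """Mass load in kg/day.

    mg/l equals g/m^3, so conc * discharge is g/s; multiplying by
    86,400 s/day and dividing by 1,000 g/kg gives the factor 86.4.
    """
    return conc_mg_per_l * discharge_m3_per_s * 86.4


def mixed_concentration(c_main, q_main, c_trib, q_trib):
    """Flow-weighted concentration below a confluence, assuming complete
    mixing and no gains or losses of the constituent."""
    return (c_main * q_main + c_trib * q_trib) / (q_main + q_trib)


print(load_kg_per_day(10.0, 2.0))
print(mixed_concentration(c_main=100.0, q_main=3.0, c_trib=500.0, q_trib=1.0))
```

Neither quantity can be calculated from concentration data alone.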

Another  concern  when  locating  stations  is  accessibility.  If  a 
sampling  station  is  located  a  substantial  distance  from  a  road,  make  sure 
time and manpower costs of sampling are considered. In many cases, bridge
locations are selected for sampling stations. They provide ready access to
the entire cross section, even during high flows. Bridges are, however,
not without their disadvantages. Their purpose is to move traffic and, as
such, they may not be positioned properly for water quality monitoring
purposes. Bridges may also influence the water flow and quality at a site.

If  biological  sampling  is  to  be  involved  in  the  study,  you  should 
consider  the  physical  substrate  (boulders,  rubble,  sand,  and  mud),  velocity 
of  flow,  exposure  to  the  sun  and  the  width  and  depth  of  the  stream.  In 
general,  aquatic  biological  sampling  in  streams  involves  systematic 
resampling  of  (1)  a  transverse  or  longitudinal  transect  or  (2)  a  grid  or 
quadrant  system.  Transect  sampling  consists  of  collecting  samples  either 
along a section of stream length or in a line across the stream (Figure 9).
Samples  may  be  collected  at  uniform  intervals  along  the  transect  line  or  at 
random.  If  the  transect  line  is  along  the  stream  length  and  includes  pools 
and  riffles,  each  habitat  is  usually  considered  separately  and  sampled 
equally.  A  sampling  grid  or  quadrant  consists  of  an  imaginary  or  physical 
rectangular  arrangement  of  lines,  covering  all  or  part  of  a  given  habitat 
(Figure  9).  A  grid  or  quadrant  sampling  scheme  should,  as  with  the 
transect  scheme,  give  equal  consideration  to  the  various  habitat  types. 
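The two ways of placing samples along a transect, at uniform intervals or at random, can be sketched as follows. The transect length, the number of samples, and the seed are hypothetical, and the function names are assumptions for the example.

```python
import random


def uniform_points(length_m, n):
    """n sampling points at equal spacing, each centered in its interval."""
    step = length_m / n
    return [step * (i + 0.5) for i in range(n)]


def random_points(length_m, n, rng):
    """n sampling points drawn at random along the transect, sorted."""
    return sorted(rng.uniform(0.0, length_m) for _ in range(n))


rng = random.Random(7)  # seeded only so the sketch is repeatable
print(uniform_points(10.0, 5))
print(random_points(10.0, 5, rng))
```

To give pools and riffles equal consideration, the same function could be applied separately to the length of transect within each habitat.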

When  locating  sampling  stations  in  a  lake  or  reservoir,  you  need  to 
consider  the  (1)  thermal  stratification,  (2)  circulation  of  the  water,  and 
(3)  morphology  of  the  basin.  Each  of  these  factors  strongly  influences  the 
spatial  distribution  of  the  water  quality  parameters  throughout  the  lake  or 
reservoir. 

Figure 9. Examples of transect and grid sampling schemes.
A illustrates longitudinal and transverse transects while
B illustrates a grid of nine sampling sites (after Averett, 1911).

Figure 10. The three zones of a temperature
profile in a stratified lake. (Axis label: Temperature.)

In temperate regions, lakes and reservoirs deep enough to stratify
will typically develop a temperature profile similar to that in Figure 10.
This profile consists of three zones, the epilimnion, the metalimnion, and
the hypolimnion, each defined by the rate of change in temperature with
depth. In general, the epilimnion is a fairly wide zone consisting of warm
water which has a moderate temperature gradient. The metalimnion is
commonly a narrow zone characterized by a very rapid temperature change with
depth. The hypolimnion spans from the base of the metalimnion to the
bottom  of  the  lake  or  reservoir  and  has  a  slight  to  moderate  temperature 
gradient.  Density  differences  of  the  water,  which  are  related  to  the 
temperature,  effectively  isolate  the  hypolimnion  from  the  zones  above 
except  for  particle  exchange  due  to  gravity  or  movement  of  fish.  If 
bacterial  respiration  is  excessive  in  the  hypolimnion,  which  is  usually  the 
case  when  the  water  body  is  in  a  eutrophic  or  enriched  state,  the  dissolved 
oxygen  can  be  depleted  and  anaerobic  conditions  may  develop.  If  this 
condition  occurs  the  dissolution  of  phosphorus,  iron,  manganese  and  other 
trace  metals  from  the  sediments  can  be  expected. 

The epilimnion and metalimnion are warmer than the hypolimnion and are
the  zones  of  phytoplankton  production.  As  a  result,  the  water  quality  in 
these  zones  may  be  substantially  different  than  that  of  the  hypolimnion. 

The  point  to  remember  here  is  that  the  thermal  zones  in  a  lake  or 
reservoir  can  have  water  quality  quite  different  from  one  another.  When  a 
surface  site  is  selected  you  must  consider  the  thermal  zones  below  it  and 
make certain that the samples you obtain are representative of the system
you  think  you  are  sampling.  In  many  studies,  you  will  find  it  necessary  to 
establish  several  sampling  stations  along  a  depth  profile  (Figure  11). 
Temperature,  dissolved  oxygen,  specific  conductance,  and  pH  are  very  useful 
measurements  to  make  when  deciding  where  to  locate  sampling  stations  along 
a  depth  profile. 
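A temperature profile can be used to rough out the three zones before stations are placed. The sketch below labels each interval between readings by its temperature gradient. The cutoff of 1 degree C per meter is an assumed rule of thumb, not a value given in this text, and the profile readings are hypothetical.

```python
def classify_zones(depths_m, temps_c, threshold=1.0):
    """Label each interval between successive readings by thermal zone.

    threshold is an assumed cutoff in deg C per meter of depth; intervals
    that cool at least this fast are treated as metalimnion.
    """
    labels = []
    seen_metalimnion = False
    for i in range(len(depths_m) - 1):
        thickness = depths_m[i + 1] - depths_m[i]
        gradient = (temps_c[i] - temps_c[i + 1]) / thickness
        if gradient >= threshold:
            labels.append("metalimnion")
            seen_metalimnion = True
        elif seen_metalimnion:
            labels.append("hypolimnion")
        else:
            labels.append("epilimnion")
    return labels


# Hypothetical summer profile, readings every 2 meters.
depths = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
temps = [20.0, 19.5, 19.0, 15.0, 10.0, 9.0, 8.5]
zones = classify_zones(depths, temps)
print(zones)
```

At least one sampling station could then be assigned to each labeled zone.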


Figure 11. Illustration of sample locations along the
depth profile in a stratified lake. (Zones labeled from top to bottom:
epilimnion, metalimnion, hypolimnion; • = Sampling station)


Circulation  of  the  water  is  another  factor  that  you  need  to  consider 
when  locating  stations  in  lakes  and  reservoirs.  During  the  spring  and 
fall,  the  water  mass  overturns,  due  to  a  density  change  derived  from  the 
seasonal  cooling  or  warming,  and  the  water  obtains  a  uniform  temperature 
throughout  the  entire  depth  profile  (Figure  12).  At  this  time,  the  water 
quality  is  generally  uniform  throughout  the  depth  of  the  lake  and  a  single 
sample  collected  at  0.5  to  1.0  meters  depth  may  be  representative  of  the 
water  column. 

Wind  will  generally  cause  the  water  in  the  epilimnion  to  circulate  and 
facilitates  the  mixing  of  water  quality  constituents  throughout  this  zone 
(Figure  13).  In  the  case  of  a  circular  lake  where  wind  mixing  has 
occurred,  a  sample  collected  at  the  lake's  outlet  would  probably  be  as 
representative  of  the  water  quality  of  the  epilimnion  as  a  sample  collected 
at  the  center  of  this  zone. 

Figure 12. Temperature profile in a lake or reservoir during the
period of overturn, either in the spring or fall. (Axis label: Temperature.)

Figure 13. An illustration of the effect of wind on the
mixing of water in the epilimnion.


Figure 14. A hypothetical example of where to locate sampling stations
to monitor surface water quality on a multiple use lake.

If the morphology of a lake or reservoir is irregular, the mixing
patterns of the epilimnion by the wind may vary substantially. As a
result, several sampling stations may be required to characterize the water
quality of the lake. For example, consider the lake illustrated in Figure
14. Here we have several land uses located around a lake which is
irregularly shaped. The area around the recreational home development is
shaped like an hour glass and should probably have each "bulb" sampled
separately. The island isolates a cove, which would need to be
sampled separately. The other coves and the center of the lake may or may
not  have  to  be  sampled,  depending  on  the  mixing  caused  by  the  wind.  The 
swimming  beach  area,  which  is  divided  by  a  peninsula,  would  require  at 
least  two  sampling  stations.  However,  the  actual  number  of  sampling 
locations  and  intensity  of  sampling  would  depend  upon  the  original 
objectives  of  the  monitoring  plan. 

When  locating  sampling  stations  in  lakes  and  reservoirs,  be  careful 
not  to  overlook  the  areas  of  sediment  deposition  (Averett,  1976).  These 
are  often  areas  of  potential  enrichment  and  may  have  a  substantial 
influence  on  the  water  quality  of  the  lake  or  reservoir  in  the  future  as 
well  as  give  insight  to  past  conditions  of  the  water  body.  You  may  need  to 
obtain  some  grab  samples  or  dredge  hauls  of  the  bottom  sediment  in  your 
sampling program to delineate these areas. You may also wish to prepare a
bathymetric map of the lake or reservoir, if one is not already available,
to help delineate your stations.

Figure 15. Location of sampling stations around
a solid waste disposal site. (Cross-section view labels: precipitation,
water table. Plan view labels: groundwater flow, contamination boundary,
observation well.)

Most groundwater quality problems confronting the forest hydrologist
involve the contamination of unconfined or water table aquifers from point
sources, such as solid waste disposals or leach fields below sewage
treatment facilities. When locating your sampling stations for this type
of problem, you need to consider the soils and geology of the area, flow
direction of the ground water and accessibility. Consider the example
illustrated in Figure 15 where we have a solid waste disposal site.
Precipitation leaches through the disposal, picks up metals and other
contaminants and transports them to the water table. The soil and geology
of the area influence the rate at which leachate moves toward the water
table. Depending on the nature of the contaminant, the soil and geology
may act as a filter and reduce the concentration of the contaminant
reaching  the  water  table.  If  a  clay  lens  is  present,  a  perched  water  table 
may  develop.  The  movement  of  the  ground  water  strongly  influences  where 
the  observation  wells  are  placed.  In  many  cases,  wells  are  simply  located 
above  and  below  the  source  to  quantify  the  effect  of  the  treatment.  In 
other  cases,  the  concern  might  lie  with  the  rate  and  extent  of 
contamination  which  would  require  a  more  extensive  monitoring  program 
(Figure  15).  Sometimes,  we  are  not  even  sure  which  way  the  ground  water 
flows  and  must  position  our  observation  wells  in  a  radial  pattern  around 
the  source  (Figure  16). 
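A radial layout can be generated with simple trigonometry. The following sketch places wells at equal angles on a circle around the source; the coordinates, radius, and number of wells are hypothetical, and the function name is an assumption for the example.

```python
from math import cos, radians, sin


def radial_wells(x0, y0, radius_m, n_wells):
    """Coordinates of n_wells spaced at equal angles on a circle of
    radius_m centered on the source at (x0, y0)."""
    step = 360.0 / n_wells
    return [
        (x0 + radius_m * cos(radians(i * step)),
         y0 + radius_m * sin(radians(i * step)))
        for i in range(n_wells)
    ]


# Eight wells, 50 m from a source located at the origin.
wells = radial_wells(0.0, 0.0, 50.0, 8)
for x, y in wells:
    print(round(x, 1), round(y, 1))
```

Once the flow direction is known from water levels in these wells, later wells can be concentrated on the downgradient side.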


Figure 16. Radial design of observation wells around a point source.
(• = Observation well)


If  the  groundwater  problem  involves  a  confined  aquifer,  it  is 
important  that  you  obtain  knowledge  of  the  aquifer  in  question.  At  a 
minimum  this  should  include  the  areal  extent  of  the  aquifer,  its  width  and 
its  transmissibility.  Walton  (1970)  and  Freeze  and  Cherry  (1979)  present 
several  excellent  illustrative  examples  of  groundwater  monitoring. 

In  general,  access  is  limited  to  existing  wells  and  as  a  result,  we 
can  only  obtain  sketchy  information  about  the  system.  The  cost  of  drilling 
new  wells  is  usually  prohibitive.  However,  if  the  opportunity  arises  to 
establish  a  well  for  monitoring  purposes,  you  should  consult  a  geologist 
about  placement. 

6.2 Selecting Water Quality Constituents

Every  water  quality  constituent  you  monitor  represents  an  investment 
in  time,  energy  and  money.  When  designing  your  water  quality  program  be 
sure  that  each  constituent  carries  its  own  weight  and  will  contribute  data 
that  help  solve  the  problem  or  question  at  hand. 

Table  2,  which  is  an  Activity  and  Concerns  -  Water  Quality  Matrix,  has 
been  developed  to  provide  you  with  some  guidelines  for  water  quality 
constituent  selection.  The  left  margin  of  the  table  consists  of  pertinent 
hydrologic and water quality constituents. The hydrologic constituents
have  been  included  because  measurement  of  water  flow  and/or  volume  is 
essential  for  most  water  quality  studies  and  it  is  important  that  they  are 
not  overlooked.  At  the  top  of  the  table  is  a  series  of  activities  and 
concerns.  This  series  of  activities  and  concerns  is  not  all  encompassing, 
but  does  include  the  major  ones  of  interest  to  the  forest  hydrologist. 
Each  activity  and  concern,  in  turn,  has  been  subdivided  by  water  type: 
stream  (S),  lake  or  reservoir  (L),  and  ground  water  (G).  For  each 

34 


Table 2. Activities and Concerns - Water Quality Matrix

[Table 2 matrix: columns list the activities and concerns, each subdivided by water type (S, L, G); rows list the constituents below; cell entries are the priority codes 1, 2, 3 or blank.]

Constituents (rows):

    STREAMFLOW
    LAKE WATER LEVEL
    GROUND WATER LEVEL

    SUSPENDED SOLIDS (SS)
    BEDLOAD (BL)
    SEDIMENT CORE
    TURBIDITY (TURB)
    TEMPERATURE (TEMP)
    HYDROGEN ION (pH)
    ELECTRICAL CONDUCTIVITY (EC)
    TOTAL DISSOLVED SOLIDS
    CALCIUM (Ca)
    MAGNESIUM (Mg)
    SODIUM (Na)
    POTASSIUM (K)
    BICARBONATE (HCO3)
    SULFATE (SO4)
    CARBONATE (CO3)
    CHLORIDE (Cl)
    BORON (B)
    IRON (Fe)
    SELECTED METALS
    ALKALINITY
    AMMONIA (NH3 + NH4)
    NITRATE (NO3)
    TOTAL NITROGEN (TN)
    TOTAL DISSOLVED NITROGEN (TDN)
    ORTHOPHOSPHATE (PO4)
    TOTAL PHOSPHORUS (TP)
    TOTAL DISSOLVED PHOSPHORUS (TDP)
    TOTAL ORGANIC CARBON (TOC)
    DISSOLVED OXYGEN (DO)
    BIOCHEMICAL OXYGEN DEMAND (BOD)
    HERBICIDES
    INSECTICIDES
    PETROLEUM HYDROCARBONS
    RADIONUCLIDES

    TOTAL COLIFORM (TC)
    FECAL COLIFORM (FC)
    FECAL STREPTOCOCCI (FS)
    MACROINVERTEBRATES
    ALGAE

Footnotes: "S" denotes streams, "L" denotes lakes and reservoirs, and "G" denotes groundwater. 1 = Primary sampling; 2 = Secondary sampling; 3 = Tertiary sampling.


combination of activity or concern, water type, and constituent,
there is one of four priority codes: 1, 2, 3 or blank. A primary code, 1,
suggests that it is very important that the constituent be monitored.
Sampling these constituents will provide information which is necessary to
meet study objectives. A secondary code, 2, suggests that it is important
that a constituent be monitored; however, if funds are restricted, these
constituents should be considered a lower priority than those coded by a 1.
These constituents usually supply supporting information which addresses the
study objectives. A tertiary code, 3, means that this constituent probably
will  contribute  little  direct  information  to  the  study  objectives,  but  may 
be  useful  for  other  purposes.  A  blank  suggests  there  is  no  need  to  monitor 
the  constituent. 

Please  keep  in  mind  that  these  priority  codes  are  presented  only  as 
guidelines. The specific needs and objectives of your study may require
more emphasis be placed on certain constituents and less on others.

For  individuals  interested  in  a  review  of  the  various  water  quality 
constituents,  their  significance  to  beneficial  uses  and  land  use-water 
quality  interactions,  the  following  literature  is  suggested:  Brown  (1972), 
U.S.  EPA  (1977,  1976a,  1976b,  1973  and  1971),  U.S.  Forest  Service  (in 
press),  Greeson,  et  al  (1977),  Guy  (1970),  Hem  (1970),  Krygier  and  Hall 
(1971),  McKee  and  Wolf  (1963),  McNeely,  Neimans  and  Dwyer  (1979),  and 
Thatcher,  Janzer  and  Edwards  (1977). 

6.3  Guidelines  for  Determining  Sampling  Frequency 

The  frequency  of  sample  collection  should  be  designed  to  provide  the 
data  necessary  to  (1)  calculate  an  estimate  of  a  specific  population 



parameter,  such  as  the  mean,  and/or  (2)  develop  a  regression  relationship. 
In  either  case,  we  want  our  parameter  and  regression  estimators  to  fall 
within  some  pre-established  bound  of  reliability.  As  a  result,  sampling 
frequency  should  be  directly  related  to  the  variance  of  the  water  quality 
constituent  of  concern.  In  other  words,  the  more  variable  a  constituent  is 
in  time  and  space,  the  more  frequently  it  must  be  sampled  to  achieve  a 
given  level  of  reliability. 

In  this  subsection,  guidelines  for  determining  sampling  frequency  for 
several different sampling methods are presented. It should be noted that
emphasis has been placed on application of the methods as opposed to the
intricacies of the underlying statistical theory. For a more detailed
discussion of each method, including the underlying theory, two references
are suggested: Mendenhall, Ott, and Schaeffer (1971) and Cochran (1963).
Much  of  what  follows  in  this  subsection  has  been  taken  from  Freese  (1962), 
with  minor  modifications. 

6.3.1  Systematic  Sampling 

Systematic  sampling  is  easily  carried  out  and  under  some  circumstances 
is  a  useful  method.  It  consists  of  randomly  selecting  the  first  time  of 
sampling  and  then  selecting  the  remaining  samples  at  some  pre-determined 
interval,  such  as  weekly,  biweekly  or  monthly.  While  this  simple  method 
can  be  easily  used  in  most  water  quality  studies,  it  has  serious 
limitations in that the data may be biased. If the data are biased, the
statistical  analysis  may  lead  to  erroneous  inferences  about  the  water  body 
being  examined. 
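
As a rough illustration of the procedure just described, the following Python sketch draws a random first sampling day and then steps through the period at a fixed interval. The function name, the 365-day period and the seed argument are assumptions made for the example; they are not part of the original guidelines.

```python
import random

def systematic_sample_days(interval_days, period_days=365, seed=None):
    """Return sampling days (1-based day of the period) for a systematic design."""
    rng = random.Random(seed)
    # Randomly select the first sampling day within the first interval.
    first_day = rng.randint(1, interval_days)
    # Every later sample follows at the fixed, pre-determined interval.
    return list(range(first_day, period_days + 1, interval_days))

# Hypothetical weekly schedule over one year.
weekly = systematic_sample_days(7, seed=1)
print(len(weekly), weekly[:4])
```

Only the starting day is random here. Every later day is fixed by the interval, which is one reason the data may be biased if the constituent follows a cycle of similar length.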




6.3.2  Simple  Random  Sampling 

The  fundamental  principle  in  simple  random  sampling  is  that,  in 
choosing  a  sample  of  "n"  observations,  every  possible  combination  of  "n" 
observations  should  have  an  equal  chance  of  being  selected.  For  example, 
if  you  plan  on  collecting  25  daily  samples  over  a  period  of  one  year,  you 
must  choose  the  25  days  of  sample  collection  in  a  random  manner. 
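
One possible way to choose the 25 sampling days in a random manner is sketched below in Python. The function name and the seed argument are assumptions for illustration. The standard library call random.sample draws without replacement, so every set of "n" distinct days has the same chance of being chosen.

```python
import random

def random_sample_days(n, period_days=365, seed=None):
    """Return n distinct sampling days (1-based), chosen at random."""
    rng = random.Random(seed)
    # Draw n days without replacement from day 1 through period_days.
    days = rng.sample(range(1, period_days + 1), n)
    return sorted(days)

# 25 daily samples over a period of one year, as in the example above.
print(random_sample_days(25, seed=1))
```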

The question of interest here is, how do we determine "n"? More often
than not, "n" has been arbitrarily selected by a sampler basing the
decision on what "looks right." Fortunately, a simple, objective procedure
exists  for  determining  "n"  when  using  the  simple  random  sampling  method. 
The  procedure  is  based  on  the  level  of  risk  the  sampler  is  willing  to  take 
when  estimating  the  mean.  The  level  of  risk,  in  turn,  is  directly  related 
to  the  beneficial  use  of  water.  Obviously,  if  you  are  dealing  with  a 
drinking  water  supply  you  would  be  more  concerned  with  the  accuracy  of 
your  estimate  than  if  you  were  dealing  with  a  stock  watering  tank. 

In planning a water quality survey, we might state that unless the
1-in-20 chance (α = 0.05) occurs, we would like our sample estimate of the
mean to be within some specified error range of the population mean, such as
± E mg/l. Since the small sample confidence limits are computed as

    $\bar{X} \pm t_{\alpha} s_{\bar{x}}$    (1)

where $\bar{X}$ is the mean, $t_{\alpha}$ denotes the Student's t value for a
specified α and $s_{\bar{x}}$ is the standard error of the mean, this is
equivalent to stating that we want

    $E = t_{\alpha} s_{\bar{x}}$    (2)

For a simple random sample the standard error of the mean can be determined by

    $s_{\bar{x}} = \sqrt{\frac{s^2}{n}\left(1 - \frac{n}{N}\right)}$    (3)

where $s^2$ is the sample variance, "n" the number of units sampled and N is
the total number of units in the population. Substituting equation (3)
into equation (2) and solving for "n" yields equation (4)

    $n = \frac{1}{\dfrac{E^2}{t_{\alpha}^2 s^2} + \dfrac{1}{N}}$    (4)


To determine "n", we must have some estimate of the population variance,
$s^2$. Sometimes the information is available from previous surveys. In
the absence of this information, a small preliminary survey might be made
in order to obtain an estimate of the variance. When, as often happens,
neither of these solutions is feasible, a very crude estimate can be made
using equation (5)

    $s^2 = \left(\frac{R}{4}\right)^2$    (5)

where R is the estimated range from the smallest to the largest
concentration (mass) likely to be encountered in sampling. This
approximation procedure should be used only when no other estimate of the
variance is available and the observations are approximately normally
distributed.

Having  specified  a  value  of  E  and  obtained  an  estimate  of  the 
variance,  the  last  piece  of  information  required  is  the  value  of  t.  Here 
we  hit  a  circular  problem.  To  use  t  we  must  know  the  number  of  degrees  of 
freedom.  However,  the  number  of  degrees  of  freedom  is  "n-1"  and  "n"  is  not 
known  and  cannot  be  determined  without  knowing  t. 

An  iterative  approach  can  be  used  to  solve  this  problem.  The 
procedure  is  to  guess  at  a  value  of  "n,"  use  the  guessed  value  to  get  the 
degrees  of  freedom  for  t  and  then  substitute  the  appropriate  t  value  into 
the  sample-size  formula  (equation  4)  and  solve  for  a  first  approximation  of 
n.  Selecting  a  new  "n"  somewhere  between  the  guessed  value  and  the  first 
approximation,  but  closer  to  the  latter,  we  compute  a  second  approximation. 
The  procedure  is  repeated  until  successive  values  of  "n"  are  nearly  the 
same;  usually  three  trials  will  suffice. 



If the sampling fraction is likely to be small (n/N < 0.05), the term
(1 - n/N) of the standard error formula (3) can be ignored and the sample
size formula (4) simplifies to

    $n = \frac{t_{\alpha}^2 s^2}{E^2}$    (6)

Examples 2a and 2b illustrate the estimation of sample size for the
simple random sampling method.




Example  2a 
Estimating  Sample  Size  for  the  Simple  Random  Sampling  Method 


Problem: 

Blue  Spruce  Reservoir,  which  is  underlain  by  gypsum  bearing  rock 
formations,  drains  into  Camp  Creek.  There  is  some  concern  by  downstream 
users  that  the  sulfate  concentrations  are  excessively  high.  The  Forest 
Supervisor would like an estimate, within 15 mg/l, of the mean annual SO4
concentration passing the stream gage immediately below the outlet spillway
with a fairly high degree of reliability (α = 0.05). There is little
fluctuation in the discharge from the dam; therefore, simple random
sampling can be applied. Assume the SO4 concentration varies between 20
and 100 mg/l during the year. Estimate the necessary sample size, n.

Solution: 

If the sample size is less than 18, then we may use the simplified
formula, since 18/365 = 0.049 < 0.05.

    $n = \frac{t_{\alpha}^2 s^2}{E^2}$

We know from the problem that E = 15 mg/l, α = 0.05 and R = 80 mg/l. The
variance can be estimated as follows.

    $s^2 = \left(\frac{R}{4}\right)^2 = \left(\frac{80}{4}\right)^2 = 400$

To determine t we can use as a first approximation n = 18, which yields 17
d.f. and t.05(17) = 2.110 (see Appendix Table A, Values of t). The first
estimate of n can now be calculated.

    $n = \frac{t_{\alpha}^2 s^2}{E^2} = \frac{(2.110)^2 (400)}{15^2} = 7.91$

The correct solution is somewhere between 7.91 and 18, but much closer to
7.91. For our second trial we select n = 8. The value of t now becomes
2.365.

    $n = \frac{(2.365)^2 (400)}{(15)^2} = 9.94$

We now know the correct solution lies between 8 and 9.94. Repeated trials
will give values between 9.1 and 9.94. Since the sample size, n, must be
an integral value and, because 9 is too small, a sample of n = 10
observations would be required for the desired precision.




Example  2b 
Estimating  Sample  Size  for  Simple  Random  Sampling 


Problem: 


A  preliminary  sample  (10  observations)  of  electrical  conductivity  in 
the  epilimnion  of  Elk  Lake  yielded  the  following  statistics. 

$\bar{X}$ = 187     s = 35

What sample size would be required to estimate the mean EC in the
epilimnion of Elk Lake within plus or minus 10 percent, with a 1-in-20
chance of being wrong in the conclusion that you have done so? Assume
simple random sampling is to be employed and n/N is less than 0.05.

Solution: 

The confidence limits on the mean are given by

    $\bar{X} \pm t_{.05} \frac{s}{\sqrt{n}}$

Therefore:

    $187 \pm t_{.05} \frac{35}{\sqrt{n}}$

The 95 percent confidence limits of plus or minus 10 percent of the
mean gives

    $18.7 = t_{.05} \frac{35}{\sqrt{n}}$

Solving for "n" yields

    $n = \frac{t_{.05}^2 (35)^2}{(18.7)^2}$

For our first trial we select n = 25 which gives us 24 d.f.; therefore
t.05(24) = 2.064.

    $n = \frac{(2.064)^2 (35)^2}{(18.7)^2} = 14.9$

We know the correct solution lies between 14.9 and 25, but closer to 14.9.
For our second trial n is set at 16.

    $n = \frac{(2.131)^2 (35)^2}{(18.7)^2} = 15.9$

From repeated trials we find little difference in the calculated n,
therefore we select 16 as the sample size.
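
The trial-and-error procedure used in Examples 2a and 2b can be sketched in code. The Python below is a minimal illustration under stated assumptions, not the procedure as published. Each new trial simply uses the previous approximation rounded up, rather than a hand-picked value between the guess and the approximation. The t values are standard two-tailed values for α = 0.05, typed in here rather than read from Appendix Table A. The names T_05, t_value and sample_size are hypothetical.

```python
import math

# Two-tailed Student's t values for alpha = 0.05, keyed by degrees of freedom.
T_05 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447,
    7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228, 11: 2.201, 12: 2.179,
    13: 2.160, 14: 2.145, 15: 2.131, 16: 2.120, 17: 2.110, 18: 2.101,
    19: 2.093, 20: 2.086, 21: 2.080, 22: 2.074, 23: 2.069, 24: 2.064,
    25: 2.060, 26: 2.056, 27: 2.052, 28: 2.048, 29: 2.045, 30: 2.042,
}

def t_value(df):
    # Assumption: beyond 30 d.f. reuse the 30 d.f. value (slightly conservative).
    return T_05[min(max(df, 1), 30)]

def sample_size(E, variance, n_guess, N=None):
    """Iterate on n until successive trials agree.

    Returns (n, trials); trials holds each approximation rounded to 2 decimals.
    """
    n = max(2, n_guess)
    seen, trials = set(), []
    while True:
        t = t_value(n - 1)
        if N is None:
            approx = t**2 * variance / E**2                       # equation (6)
        else:
            approx = 1.0 / (E**2 / (t**2 * variance) + 1.0 / N)   # equation (4)
        trials.append(round(approx, 2))
        n_next = max(2, math.ceil(approx))
        if n_next == n or n_next in seen:
            # Converged, or the trials alternate between two values:
            # keep the larger one, which meets the error bound.
            return max(n, n_next), trials
        seen.add(n)
        n = n_next

print(sample_size(E=15, variance=(80 / 4) ** 2, n_guess=18))   # Example 2a inputs
print(sample_size(E=18.7, variance=35 ** 2, n_guess=25))       # Example 2b inputs
```

Passing N (for example N=365) switches the calculation to equation (4), which applies when the sampling fraction is not small.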




In  some  cases  you  may  want  to  determine  your  sample  size  based  on  a 
pre-established  estimate  of  the  magnitude  of  change  (difference)  in  the 
concentration  or  mass  of  a  water  quality  constituent  between  paired 
stations.  As  with  other  procedures  used  to  estimate  sample  size  when 
simple  random  sampling  is  employed,  this  method  is  also  based  on  a  good 
estimate  of  the  sample  variance.  The  method  outlined  below  is  discussed  in 
detail  by  Snedecor  and  Cochran  (1967)  and  has  been  summarized  by  Potyondy 
(1977). 

The  procedure  requires  you  to  select  a  value,  d,  which  represents  the 
size  of  difference  between  the  paired  stations  that  is  regarded  as 
important. If the difference is as large as d, we would like the
monitoring program to have a high probability (probabilities of 0.80 and
0.90 are common) of showing a statistically significant difference between
the paired stations. In statistical jargon, the calculation allows the
selection of the confidence level of the test (1 - α) as well as the power
of the test (1 - β) and combines these two elements in determination of the
sample  size. 

The  following  example  taken  from  Potyondy  (1977)  is  used  to  illustrate 
the  mechanics  of  this  procedure.  Consider  the  following  sample  statistics 
from  a  set  of  turbidity  data  collected  on  the  East  Fork  Smiths  Fork 
Barometer Watershed in Utah and Wyoming: $\bar{X}$ = 4.5 JTU; s = 2.83. (It
should be noted that an underlying assumption of this procedure is that the
data are normally distributed.) The standard deviation, s, can be
expressed as a percent of the mean, referred to as the coefficient of
variation, CV. Therefore:

    $CV = (s/\bar{X})100 = (2.83/4.5)100 = 63\%$    (7)




The standard deviation of the difference, $s_d$, is estimated as:

    $s_d = \sqrt{2}\,CV = \sqrt{2}\,(63) = 89\%$    (8)

Suppose we wish to detect a difference of 5 JTU's between the paired
stations of interest. Expressed as a percent of the mean, the difference
to be detected, d, is determined as follows:

    $d = (5.0/4.5)100 = 111\%$    (9)

Assume that we want to be 90 percent certain of showing a statistically
significant difference between means in a two-tailed t-test at the α = 0.05
level of significance.

The following formula applies:

    $n_1 = (s_d^2/d^2)\, M_{(1-\beta,\,\alpha)}$    (10)

where $M_{(0.90,\,0.05)}$ is a multiplier from Table 3 which is equal to 10.5.
Substituting and solving for $n_1$ yields:

    $n_1 = (89^2/111^2)(10.5) = 6.75$

which is rounded up to the next highest integer

    $n_1 = 7$

Degrees of freedom, ν, are determined as follows:

    $\nu = 2n_1 - 2 = (2)(7) - 2 = 12$    (11)

The required sample size, n, can now be determined.

    $n = (\nu + 3)\, n_1/(\nu + 1) = (15)(7)/(13) = 8.08$    (12)

The sample size to use is rounded to 8.
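
The steps in equations (7) through (12) can be collected into one short routine. The Python sketch below is illustrative only. The function name and argument names are hypothetical, and the multipliers are the two-tailed values from Table 3. Because the code carries unrounded intermediate values, $n_1$ before rounding up comes out near 6.73 rather than 6.75, but the rounded results agree with the worked example.

```python
import math

# Multiplier M(1 - beta, alpha) for two-tailed tests (Table 3).
M_TWO_TAILED = {
    (0.80, 0.01): 11.7, (0.80, 0.05): 7.9, (0.80, 0.10): 6.2,
    (0.90, 0.01): 14.9, (0.90, 0.05): 10.5, (0.90, 0.10): 8.6,
    (0.95, 0.01): 17.8, (0.95, 0.05): 13.0, (0.95, 0.10): 10.8,
}

def paired_sample_size(mean, s, difference, power=0.90, alpha=0.05):
    """Return (n1, v, n) for a paired-station comparison."""
    cv = 100.0 * s / mean                     # equation (7), percent
    s_d = math.sqrt(2.0) * cv                 # equation (8), percent
    d = 100.0 * difference / mean             # equation (9), percent
    m = M_TWO_TAILED[(power, alpha)]
    n1 = math.ceil((s_d / d) ** 2 * m)        # equation (10), rounded up
    v = 2 * n1 - 2                            # equation (11)
    n = (v + 3) * n1 / (v + 1)                # equation (12)
    return n1, v, n

# Turbidity example: mean 4.5 JTU, s = 2.83, difference of 5 JTU to detect.
print(paired_sample_size(4.5, 2.83, 5.0))
```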

Table 3. Multiplier (M) of ($s_d^2/d^2$) to be used in paired comparative
sample size calculations (after Potyondy, 1977).

                 Two-tailed Tests             One-tailed Tests
                     α level                      α level
    (1 - β)     0.01    0.05    0.10         0.01    0.05    0.10

     0.80       11.7     7.9     6.2         10.0     6.2     4.5
     0.90       14.9    10.5     8.6         13.0     8.6     6.6
     0.95       17.8    13.0    10.8         15.8    10.8     8.6



Although  simple  random  sampling  has  its  place  in  water  quality 
monitoring, it is limited because the watershed system under investigation
is too variable with regard to its component parts. Fortunately the
component  parts  of  most  watershed  systems  vary  within  a  definite  and 
repeated  pattern  and  their  variability  can  be  reduced  and  better  understood 
using  stratified  random  sampling  methods  (Averett,  1976). 

6.3.3  Stratified  Random  Sampling 

Stratified  random  sampling  is  a  commonly  used  sampling  method  in  water 
quality  studies.  This  method  allows  the  hydrologist  to  take  advantage  of 
prior  knowledge  concerning  the  mechanisms  and  processes  controlling  the 
water  quality  in  a  watershed  system.  In  stratified  random  sampling,  the 
units  of  the  population  are  grouped  together  on  the  basis  of  similarity  of 
some  characteristic,  such  as  flow  regime  (that  is  baseflow,  stormflow, 
snowmelt  runoff,  etc.)  or  temperature  in  a  lake,  such  as  the  epilimnion  and 
the  hypolimnion.  Each  group  or  stratum  is  then  sampled  and  the  stratum 
estimates  are  combined  to  give  a  population  estimate. 

Stratified  random  sampling  offers  two  primary  advantages  over  simple 
random  sampling.  First,  it  provides  separate  estimates  of  the  mean  and 
variance  of  each  stratum.  Second,  for  a  given  sampling  intensity,  it 
generally  gives  more  precise  estimates  of  the  population  parameters  than 
would  a  simple  random  sample  of  the  same  size.  For  this  latter  advantage, 
however,  it  is  necessary  that  the  strata  be  established  so  that  the 
variability  among  sample  values  within  the  strata  is  less  than  the 
variability  in  the  population  as  a  whole. 

Some  drawbacks  of  stratified  random  sampling  are  that:  (1)  each  unit 
in  the  population  must  be  assigned  to  one  and  only  one  stratum;  (2)  the 



size  of  each  stratum  must  be  known;  and  (3)  a  simple  random  sample  must  be 
taken  in  each  stratum.  The  most  common  barrier  to  the  use  of  stratified 
random  sampling  is  lack  of  knowledge  of  the  strata  sizes. 

To  illustrate  the  computational  procedures  required  to  determine  the 
mean and its confidence limits from a stratified random sample, consider the
electrical conductivity data tabulated in Table 4. The flow regime was
divided into three periods (strata): (1) winter baseflow (November 1/
April 15); (2) snowmelt runoff (April 16/July 15); and (3) summer runoff
(July 16/October 31). Grab samples were collected ten times during winter
baseflow,  25  times  during  snowmelt  runoff  and  15  times  during  summer 
runoff.  Only  one  sample  was  collected  per  day  and  each  sample  day  was 
selected  at  random. 


Table 4. Electrical conductivity data (µmhos/cm) collected from a Rocky
Mountain stream.

Stratum I. Winter Baseflow

    Observations:  110  112  100  119  105
                   113  115  106  107  100

    Total = 1087     $\bar{X}$ = 108.7     s = 6.25

Stratum II. Snowmelt Runoff

    Observations:   89   73   51   41   57
                    72   54   43   47   69
                    43   50   49   51   77
                    51   62   68   63   81
                    68   74   39   48   85

    Total = 1505     $\bar{X}$ = 60.2     s = 14.6

Stratum III. Summer Runoff

    Observations:  156  172  191
                   145  164  210
                   129  178  139
                   187  154  145
                   159  167  180

    Total = 2476     $\bar{X}$ = 165.1     s = 21.78




The mean EC of the stratified sample is computed by the general
equation

    $\bar{X}_{st} = \frac{\sum_{h=1}^{L} N_h \bar{X}_h}{N}$    (13)

where $\bar{X}_{st}$ is the mean of the stratified sample, L the number of
strata, $\bar{X}_h$ the sample mean of stratum h, $N_h$ is the total size
(number of possible observations) of stratum h, and N is the total number
of possible observations in all strata. Using the data presented in
Table 4, the mean can be calculated as follows:

    L = 3

    $N_{I}$ = 166      $\bar{X}_{I}$ = 108.7

    $N_{II}$ = 91      $\bar{X}_{II}$ = 60.2

    $N_{III}$ = 108    $\bar{X}_{III}$ = 165.1

    N = 365

    $\overline{EC}_{st} = \frac{166(108.7) + 91(60.2) + 108(165.1)}{365}$

    $\overline{EC}_{st}$ = 113 µmhos/cm

The  mean  EC  computed  here  is  basically  a  time  weighted  average  which  is  the 
average  daily  EC  of  the  water  passing  the  point  of  measurement. 
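
The effect of weighting by stratum size can be checked with a few lines of Python. This is only a sketch; the function name is hypothetical and the inputs are the stratum sizes, means and totals reported in Table 4. It also computes the unweighted mean of all 50 observations for comparison, which is lower because the snowmelt period was sampled most heavily.

```python
def stratified_mean(sizes, means):
    """Equation (13): stratum means weighted by stratum size N_h."""
    N = sum(sizes)
    return sum(N_h * x_h for N_h, x_h in zip(sizes, means)) / N

sizes = [166, 91, 108]            # days in each flow period (N_h)
means = [108.7, 60.2, 165.1]      # stratum means from Table 4

ec_stratified = stratified_mean(sizes, means)
# Unweighted mean of all 50 observations, from the Table 4 totals.
ec_pooled = (1087 + 1505 + 2476) / (10 + 25 + 15)

print(round(ec_stratified, 1), round(ec_pooled, 1))
```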

The standard error of the mean of a stratified random sample is
calculated by the general equation

    $s_{\bar{x}_{st}} = \sqrt{\frac{1}{N^2} \sum_{h=1}^{L} \left[ \frac{N_h^2 s_h^2}{n_h} \left(1 - \frac{n_h}{N_h}\right) \right]}$    (14)

where $n_h$ is the number of observations in stratum h, $s_h^2$ is the
variance of the sample from stratum h and the other terms are as previously
defined. If the sampling fraction within a particular stratum ($n_h/N_h$)
is small (that is, less than 0.05), the term ($1 - n_h/N_h$) can be omitted
for that particular stratum when calculating the standard error of the
mean. For the electrical conductivity example the standard error can be
calculated as follows:

    $s_{\bar{x}_{st}} = \sqrt{\frac{1}{(365)^2} \left[ \frac{(166)^2 (6.25)^2}{10}\left(1 - \frac{10}{166}\right) + \frac{(91)^2 (14.6)^2}{25}\left(1 - \frac{25}{91}\right) + \frac{(108)^2 (21.78)^2}{15}\left(1 - \frac{15}{108}\right) \right]}$

    $s_{\bar{x}_{st}}$ = 1.88


A rough estimate of the 95% confidence interval about the mean can be
obtained using equation (15).

    $\bar{X}_{st} \pm 2\,(s_{\bar{x}_{st}})$    (15)

For our electrical conductivity example, the confidence interval would
range from 109 to 117 µmhos/cm.
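
The standard error and the rough confidence interval for the electrical conductivity example can be reproduced as follows. This Python sketch assumes the stratum sizes, sample counts and standard deviations given in Table 4, and a stratified mean of 113.3 µmhos/cm. The function name is hypothetical.

```python
import math

def stratified_standard_error(sizes, n_obs, std_devs):
    """Equation (14): standard error of the stratified mean."""
    N = sum(sizes)
    total = 0.0
    for N_h, n_h, s_h in zip(sizes, n_obs, std_devs):
        fpc = 1.0 - n_h / N_h                  # finite population correction
        total += (N_h ** 2) * (s_h ** 2) / n_h * fpc
    return math.sqrt(total) / N

sizes = [166, 91, 108]             # N_h, days in each stratum
n_obs = [10, 25, 15]               # n_h, samples taken in each stratum
std_devs = [6.25, 14.6, 21.78]     # s_h from Table 4

se = stratified_standard_error(sizes, n_obs, std_devs)
mean = 113.3                       # stratified mean EC, umhos/cm
low, high = mean - 2 * se, mean + 2 * se   # equation (15)
print(round(se, 2), round(low, 1), round(high, 1))
```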

Before  an  estimate  of  the  total  sample  size  can  be  made,  the 
hydrologist  must  select  the  method  of  sample  allocation.  Basically,  there 
are  two  methods  of  sample  allocation:  proportional  and  optimal.  In  the 
proportional  allocation  procedure,  the  proportion  of  the  sample  that  is 
selected  in  the  hth  stratum  is  made  equal  to  the  proportion  of  all  units 
in  the  population  which  fall  in  that  stratum.  If  a  stratum  contains  half 
of  the  units  in  the  population,  half  of  the  samples  would  be  collected  in 
that stratum. In equation form, if the total number of sample units is to
be "n," then for proportional allocation the number to be observed in
stratum "h" is

    $n_h = \left(\frac{N_h}{N}\right) n$    (16)

In optimum allocation the observations are allocated to the strata so
as to give the smallest standard error possible with a total of "n"
observations. For a sample size "n," the optimum allocation is

    $n_h = \left(\frac{N_h s_h}{\sum_{h=1}^{L} N_h s_h}\right) n$    (17)

The  best  way  to  allocate  a  sample  among  the  various  strata  depends  on 
the study objectives and our information about the population. The optimum
allocation  is  preferable  if  the  objective  is  to  get  the  most  precise 
estimate  of  the  population  mean  for  a  given  cost.  If  we  want  separate 
estimates  for  each  stratum  and  the  overall  estimate  is  of  secondary 
importance,  we  may  want  to  sample  heavily  in  the  strata  having  high-value 
information.  Then  we  would  ignore  both  optimum  and  proportional  allocation 
and  place  our  observations  so  as  to  give  the  degree  of  precision  desired 
for  the  particular  strata. 
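
To compare the two allocation rules, the sketch below applies equations (16) and (17) to the three flow strata of Table 4 with a total of 50 observations. It is written in Python with hypothetical function names, and it returns unrounded values so that the rounding rule is left to the user.

```python
def proportional_allocation(n, sizes):
    """Equation (16): n_h = (N_h / N) * n."""
    N = sum(sizes)
    return [n * N_h / N for N_h in sizes]

def optimum_allocation(n, sizes, std_devs):
    """Equation (17): n_h = n * N_h * s_h / sum(N_h * s_h)."""
    weights = [N_h * s_h for N_h, s_h in zip(sizes, std_devs)]
    total = sum(weights)
    return [n * w / total for w in weights]

sizes = [166, 91, 108]             # N_h for winter, snowmelt, summer
std_devs = [6.25, 14.6, 21.78]     # s_h from Table 4

prop = proportional_allocation(50, sizes)
opt = optimum_allocation(50, sizes, std_devs)
print([round(x, 1) for x in prop])
print([round(x, 1) for x in opt])
```

With these inputs the optimum rule moves observations away from the long but stable winter baseflow period and toward the more variable summer runoff period.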

The procedure for estimating the total size of sample (n) needed in a
stratified random sample can now be addressed. Basically three pieces of
information are required:

(1) a reasonably good estimate of the variance ($s_h^2$) or standard
deviation ($s_h$) among individuals within each stratum.

(2) the method of sample allocation.

(3) a statement of the desired size of the standard error of the mean,
symbolized by D.

Some  preliminary  sampling  is  generally  required  to  determine  the 
desired  size  of  the  standard  error  of  the  mean.  The  estimate  of  D  in  the 
sample  size  equations  is  generally  taken  to  be  some  portion,  such  as 
two-thirds  or  one-half,  of  the  standard  error  calculated  from  the 
preliminary  sample. 



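Given this hard-to-obtain information, the stratified random sample
size can be estimated by the following equations.

For proportional allocation:

    $n = \frac{t_{\alpha}^2 N \sum_{h=1}^{L} N_h s_h^2}{N^2 D^2 + t_{\alpha}^2 \sum_{h=1}^{L} N_h s_h^2}$    (18)

For optimum allocation:

    $n = \frac{t_{\alpha}^2 \left(\sum_{h=1}^{L} N_h s_h\right)^2}{N^2 D^2 + t_{\alpha}^2 \sum_{h=1}^{L} N_h s_h^2}$    (19)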


The value "2" is commonly used as an estimate of the Student's t
value. When sampling fractions ($n_h/N_h$) are likely to be very small for
all strata, the second term of the denominators of the above equations
may be omitted, leaving only $N^2 D^2$.

If the optimum allocation formula indicates a sample ($n_h$) greater
than the total number of units ($N_h$) in a particular stratum, $n_h$ is
usually made equal to $N_h$. The previously estimated sample size (n)
should then be dropped, and the total sample size and allocation for the
remaining strata recomputed omitting the $N_h$ and $s_h$ values for the
offending stratum, but leaving N and D unchanged.

Example  3  illustrates  how  to  estimate  the  sample  size  for  a 
stratified  random  sample. 




Example  3 
Estimating  Sample  Size  for  a  Stratified  Random  Sample 


Problem: 

The  mean  daily  electrical  conductivity  is  to  be  determined  at  the 
mouth  of  Cabin  Creek  which  is  located  in  the  northern  Colorado  Rockies. 
Estimate  the  sample  size  that  would  be  required  and  distribute  the  samples 
over  a  one  year  period. 

Solution: 

The flow regime can be divided into three periods (strata): winter
baseflow (November 1/April 15); snowmelt runoff (April 16/July 15); and
summer runoff (July 16/October 31). Data collected on a nearby stream
provided information about the variance.

    Stratum (h)     $N_h$     $s_h$

    1 (WB)           166        8
    2 (SM)            91       24
    3 (SRO)          108       41

An estimate of the standard error of the mean, $s_{\bar{x}}$, was made from
past data.

    $s_{\bar{x}}$ = 5.05

The desired D is set equal to one-half of $s_{\bar{x}}$. Therefore, D = 2.53.
In addition, the optimal allocation method is selected to allocate the
samples.

The sample size, n, can now be determined using the optimal allocation
method.

    $n = \frac{\dfrac{(2)^2 \left(\sum N_h s_h\right)^2}{N^2 D^2}}{1 + \dfrac{(2)^2 \sum N_h s_h^2}{N^2 D^2}} = \frac{296}{2.15} = 138$

The determined n is the sample size necessary to estimate the sample mean
with a standard error of 2.53. However, because of budgetary constraints,
it may not be possible to sample the stream 138 times. If that is the
case, then we would have to lower the reliability constraint on the
estimate of the mean. If we set D = $s_{\bar{x}}$ the required sample size
becomes

    n = 58




In this hypothetical problem assume that n = 58 is accepted. The next
step is to allocate the sample by strata. This is achieved as follows
[from equation (17)].

Stratum 1 (winter):

    $n_1 = \frac{(166)(8)(58)}{7940} = 10$

Stratum 2 (snowmelt runoff):

    $n_2 = \frac{(91)(24)(58)}{7940} = 16$

Stratum 3 (summer):

    $n_3 = \frac{(108)(41)(58)}{7940} = 33$

At  this  point  you  should  look  at  the  allocation  and  ask  yourself  if  it 
looks  right.  In  this  case,  most  of  the  samples  are   allocated  to  the  summer 
runoff  period.  This  is  the  period  of  greatest  variation  in  the  water 
quality  and,  hence,  the  period  that  should  be  sampled  most  intensely.  On 
the  other  hand,  the  water  quality  is  fairly  stable  during  baseflow  and 
requires  the  least  amount  of  sampling.  Snowmelt  varies  twice  as  much  as 
baseflow  but  occurs  over  a  period  equal  to  two-thirds  of  the  period  for 
baseflow.  As  a  result,  the  sampling  of  snowmelt  looks  about  right.  It  is 
decided  that  the  allocation  is  acceptable. 
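
The arithmetic of Example 3 can be checked with the short Python sketch below. It implements equations (18) and (19) with t taken as 2, then allocates the accepted sample with the optimum allocation rule. The function names are hypothetical. Rounding each stratum up to the next whole sample is an assumption that matches the values shown in the example, and it means the allocated total can exceed n by one.

```python
import math

def n_proportional(sizes, std_devs, D, t=2.0):
    """Equation (18): total sample size under proportional allocation."""
    N = sum(sizes)
    sum_Ns2 = sum(N_h * s_h ** 2 for N_h, s_h in zip(sizes, std_devs))
    return t ** 2 * N * sum_Ns2 / (N ** 2 * D ** 2 + t ** 2 * sum_Ns2)

def n_optimum(sizes, std_devs, D, t=2.0):
    """Equation (19): total sample size under optimum allocation."""
    N = sum(sizes)
    sum_Ns = sum(N_h * s_h for N_h, s_h in zip(sizes, std_devs))
    sum_Ns2 = sum(N_h * s_h ** 2 for N_h, s_h in zip(sizes, std_devs))
    return t ** 2 * sum_Ns ** 2 / (N ** 2 * D ** 2 + t ** 2 * sum_Ns2)

sizes = [166, 91, 108]      # N_h: winter baseflow, snowmelt, summer runoff
std_devs = [8, 24, 41]      # s_h from the nearby stream

n_half = math.ceil(n_optimum(sizes, std_devs, D=2.53))   # D = one-half of 5.05
n_full = math.ceil(n_optimum(sizes, std_devs, D=5.05))   # D = 5.05

# Allocate the accepted sample among strata in proportion to N_h * s_h.
weights = [N_h * s_h for N_h, s_h in zip(sizes, std_devs)]
allocation = [math.ceil(n_full * w / sum(weights)) for w in weights]

print(n_half, n_full, allocation)
```

For the same D, equation (18) gives a larger total than equation (19) with these inputs, which is the cost of ignoring the differences in variability among strata.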




7.0  Guidelines  for  Collecting  and  Handling  of  Water  Quality  Samples 

Obtaining  representative  samples  and  then  maintaining  the  integrity  of 
the  constituents  is  an  integral  part  of  any  wildland  water  quality  program. 
If  the  samples  are  not  collected  and  handled  properly  the  data  will  be  of 
little  value  no  matter  how  well  the  sampling  program  was  designed. 

Although analytical techniques have been standardized to a very high
degree (American Public Health Association (APHA) 1976), at this time,
there are no established standards for USDA-Forest Service hydrologists to
follow when collecting and handling water quality samples even though the
National Handbook of Recommended Methods of Water Data Acquisition (USGS,
1977) exists. As a result, collection methods may differ between
hydrologists. When analyzing data, it is generally taken for granted that
the data are representative of the water body from which the sample was
obtained. However, this assumption can result in erroneous inferences
about the quality of the water body being studied, especially if several
different  individuals  were  involved  in  the  collection  of  the  samples. 
Before  you  compare  data  collected  by  different  individuals,  satisfy 
yourself  that  the  samples  were  collected  and  handled  properly  and  that  the 
data  are  truly  representative  of  the  water  body  from  which  they  were 
collected.  The  methods  of  sample  collection  and  handling  as  well  as  the 
analytical  methods  used  to  measure  each  constituent,  should  be  clearly 
documented  in  the  Water  Quality  Monitoring  Plan  of  Operation. 

The  purpose  of  this  subsection  is  to  discuss  the  types  of  sampling  and 
to  present  guidelines  for  collecting  and  handling  water  quality  samples. 




7.1     Types  of  Samples 

7.1.1  Grab  Samples 

A  grab  sample  is  a  sample  collected  at  a  particular  time  and  place. 
Strictly  speaking,  a  grab  sample  can  represent  only  the  composition  of  the 
water  body  at  that  time  and  place.  However,  when  a  water  body  is  known  to 
be  fairly  constant  in  composition  over  a  considerable  period  of  time  or 
over  substantial  distances  in  all  directions,  then  a  grab  sample  may  be 
said  to  represent  a  longer  time  period  or  a  larger  volume,  or  both,  than 
the specific point at which it was collected (APHA, et al., 1976). When a
water body is known to vary with time, grab samples collected at suitable
intervals and analyzed separately can be of great value in documenting the
extent, frequency and duration of these variations. Sampling intervals
should be selected on the basis of the frequency with which changes are
expected. 

7.1.2  Composite  Samples 

In  most  cases,  the  term  "composite  sample"  refers  to  a  mixture  of  grab 
samples  collected  at  the  same  sampling  point  at  different  times  or  to  a 
sample  formed  by  continuously  collecting  a  portion  of  the  flow.  The 
formation  of  a  composite  sample  serves  as  an  alternative  to  the  separate 
analysis  of  a  large  number  of  grab  samples,  followed  by  computation  of  the 
average. Composite sampling can represent a substantial saving in
laboratory effort and funds; however, this saving is sometimes obtained at
the expense of data resolution, because variation among the individual
portions is no longer visible.
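The trade-off described above can be illustrated with a small sketch. The concentrations are hypothetical, and equal-volume portions are assumed, so the composite concentration equals the arithmetic mean of the separate grab-sample results. The composite gives the average with one analysis, but the range seen in the individual grabs is lost.

```python
from statistics import mean

# Hypothetical concentrations (mg/L) from six equal-volume grab samples
# collected at one sampling point at different times.
grab_mg_per_l = [12.0, 15.0, 48.0, 95.0, 30.0, 16.0]


def composite_concentration(concentrations):
    """Concentration of a composite made from equal-volume portions."""
    return sum(concentrations) / len(concentrations)


composite = composite_concentration(grab_mg_per_l)

# One composite analysis reproduces the average of the separate analyses...
print(f"composite result: {composite:.1f} mg/L")
print(f"mean of grabs   : {mean(grab_mg_per_l):.1f} mg/L")

# ...but only the separately analyzed grabs reveal the variation.
print(f"range of grabs  : {min(grab_mg_per_l):.1f} to {max(grab_mg_per_l):.1f} mg/L")
```

If the portions were not of equal volume, the composite would be a volume-weighted mean instead.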

Composite  samples  can  only  be  used  for  constituents  that  do  not  change 
appreciably  in  character  during  the  interval  from  collection  to  analysis. 



Under  no  circumstances  should  microbiological  samples  be  composited.  If 
preservatives  are  used,  add  them  to  the  sample  bottle  initially  so  that  all 
portions  of  the  composite  are  preserved  as  soon  as  collected. 

7.2  Sample  Collection 

When  samples  are  collected  from  a  stream,  the  sampler  must  consider 
the  variability  of  constituent  concentration  with  streamflow,  depth,  water 
velocity,  distance  from  the  bank  and  distance  from  one  bank  to  the  other. 
It is very important that samples be collected during representative flows
over the time period of interest. If storm flows occur, it is important
that they be sampled. For some constituents, such as suspended solids, the
majority of mass transport will occur during storm flow and/or snowmelt
runoff. In some cases, data resolution will require sample collection on
both the rising limb and the falling limb of the hydrograph.

If  equipment  is  available,  it  is  best  to  take  an  "integrated"  stream 
sample  from  the  water  surface  to  the  stream  bottom  at  selected  intervals 
across  the  channel  in  such  a  way  that  the  sample  is  made  composite 
according  to  flow.  If  only  a  grab  sample  can  be  collected,  it  is  best  to 
take  it  in  the  middle  of  the  stream  at  the  0.6  depth.  Brown  and  others 
(1970),  Guy  (1970)  and  Greeson  and  others  (1977)  discuss  the  various  types 
of  sampling  equipment  in  detail. 
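One way to read "made composite according to flow" is as a discharge-weighted mean of the concentrations measured at intervals across the channel. The sketch below assumes that reading; the function name, the subsection discharges, and the concentrations are hypothetical and are not taken from the cited references.

```python
def flow_weighted_concentration(concentrations, discharges):
    """Discharge-weighted mean concentration across channel subsections.

    concentrations: depth-integrated concentration in each subsection (mg/L)
    discharges:     discharge carried by each subsection (ft3/s)
    """
    if len(concentrations) != len(discharges):
        raise ValueError("need one discharge per concentration")
    total_q = sum(discharges)
    if total_q <= 0:
        raise ValueError("total discharge must be positive")
    weighted_sum = sum(c * q for c, q in zip(concentrations, discharges))
    return weighted_sum / total_q


# Hypothetical values for four subsections across a stream.
conc_mg_per_l = [20.0, 35.0, 50.0, 30.0]
q_cfs = [2.0, 6.0, 8.0, 4.0]

weighted = flow_weighted_concentration(conc_mg_per_l, q_cfs)
simple = sum(conc_mg_per_l) / len(conc_mg_per_l)
print(f"flow-weighted mean: {weighted:.2f} mg/L")
print(f"unweighted mean   : {simple:.2f} mg/L")
```

The two means differ whenever concentration varies with the share of flow in each subsection, which is why a single mid-stream grab sample is only a fallback.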

Lakes  and  reservoirs  are  subject  to  considerable  variations  in  water 
quality  from  normal  causes,  such  as  seasonal  stratification,  precipitation, 
runoff  and  wind.  The  choice  of  location,  depth  and  frequency  of  sampling 
will  depend  on  local  conditions  and  the  purpose  of  the  investigation.  A 
detailed discussion of sample collection methods in lakes and reservoirs
and equipment used to collect the samples is presented by Lind (1979),
Schwoerbel (1970) and Welch (1948).

The  chemical  quality  of  ground  water  at  a  sampling  point  may  vary  in 
response  to  changes  in  rate  of  water  movement,  to  pumpage,  or  to 
differences  in  rate  and  chemical  composition  of  recharge  from  precipitation 
and  from  the  surrounding  area  (Brown  and  others,  1970).  Although 
concentrations  of  dissolved  constituents  in  ground  water  from  any  one  well 
may vary widely, sometimes severalfold, in general the changes take place
much more slowly than those commonly associated with surface water. Usually, it
is  safer  to  assume  that  the  quality  of  the  water  from  a  well  fluctuates 
rather  than  that  it  is  uniform  for  long  periods  of  time.  Changes  in  ground 
water  quality  usually  can  be  described  satisfactorily  by  a  monthly, 
seasonal  or  annual  sampling  schedule.  For  more  information  about  sampling 
ground  water,  see  Hem  (1970),  Walton  (1970)  and  Freeze  and  Cherry  (1979). 

Samples should be collected from wells only after the well has been
pumped sufficiently to ensure that the sample represents the ground water
that feeds the well. Before samples are collected from distribution
systems, such as water lines in a campground, flush the lines sufficiently
to ensure that the sample is representative of the water supply, and
sterilize the water tap.

In  all  cases,  sampling  points  should  be  fixed  by  detailed  description, 
by  maps,  or  with  the  aid  of  stakes,  buoys  or  landmarks  in  such  a  manner  as 
to  permit  their  identification  by  other  persons  without  reliance  upon 
memory  or  personal  guidance. 




7.3  Sample  Handling 

A record should be made of every sample collected, and every sample
container should be identified, preferably by attaching an appropriately
inscribed tag or label (APHA et al., 1976). The record should contain
sufficient information to provide positive identification of the sample at
a later date, as well as the name of the sample collector; the date, hour,
and exact location; the water temperature; how the sample was handled (that
is, refrigeration, acidification, degassing, etc.); and any other data
that may be needed in the future for correlation, such as weather
conditions, water level, or streamflow.
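The items listed above map naturally onto a simple record structure. The following is a minimal sketch only; the class name, field names, and example values are hypothetical and are not prescribed by APHA or by this document.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class SampleRecord:
    """One water-quality sample, with the fields named in the text above."""
    sample_id: str                 # positive identification of the sample
    collector: str                 # name of the sample collector
    collected_at: datetime         # date and hour of collection
    location: str                  # exact location of the sampling point
    water_temp_c: float            # water temperature at collection
    handling: list = field(default_factory=list)   # e.g. refrigeration
    other_data: dict = field(default_factory=dict)  # weather, stage, flow

    def label(self) -> str:
        """Text for the tag or label attached to the sample container."""
        when = self.collected_at.strftime("%Y-%m-%d %H:%M")
        return f"{self.sample_id} | {self.location} | {when} | {self.collector}"


# Hypothetical example record.
record = SampleRecord(
    sample_id="TC-A-001",
    collector="J. Doe",
    collected_at=datetime(1980, 5, 14, 9, 30),
    location="Trout Creek, station A",
    water_temp_c=6.5,
    handling=["refrigerated", "acidified"],
    other_data={"weather": "overcast", "streamflow_cfs": "42"},
)
print(record.label())
```

Keeping the handling steps in the same record as the identification makes it easier to judge later whether results from different collectors are comparable.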

After  the  sample  has  been  collected,  care  must  be  exercised  to  protect 
the  integrity  of  the  sample  to  assure  at  the  time  of  analysis  that  it  is 
representative  of  the  water  body  from  which  it  was  collected.  In  general, 
the  shorter  the  time  that  elapses  between  collection  of  a  sample  and  its 
analysis,  the  more  reliable  will  be  the  analytical  results.  For  certain 
constituents,  such  as  pH,  immediate  analysis  in  the  field  is  required  to 
obtain  dependable  results  because  the  sample  composition  may  change  before 
it  arrives  at  the  laboratory. 

It  is  impossible  to  state  exactly  how  much  time  may  be  allowed  to 
elapse  between  collection  of  a  sample  and  its  analysis;  this  depends  on  the 
character  of  the  sample,  the  particular  analyses  to  be  made  and  the 
conditions  of  storage.  Changes  caused  by  the  growth  of  organisms  are 
greatly  retarded  by  keeping  the  sample  in  the  dark  and  at  a  low  temperature 
until analysis. Where the interval between sample collection and analysis
is long enough to produce changes in either the concentration or the
physical state of the constituent to be measured, follow the preservation
practices outlined in Table 5. Record the time elapsed between sampling
and analysis, and which preservative, if any, was added.

Table 5. Summary of special sampling or sample requirements
(APHA and others, 1976; Stainton and others, 1977). a/

                                                  Minimum
                                                  Sample
Determination                 Container b/        Size, ml   Storage and/or Preservation

Acidity                       P, G(B)             100        24 hr; refrigerate
Alkalinity                    P, G(B)             200        24 hr; refrigerate
BOD                           P, G                1,000      6 hr; refrigerate
Boron                         P                   100        -
Carbon, organic, total        G(brown)            100        Analyze as soon as possible,
                                                             refrigerate or add HCl to pH < 2
Carbon dioxide                G                   100        Analyze immediately
Dissolved Organic Carbon      G                   100        Analyze as soon as possible,
                                                             filter, refrigerate
COD                           P, G                100        Analyze as soon as possible,
                                                             add H2SO4 to pH ≤ 2
Chlorine dioxide              P, G                500        Analyze immediately
Chlorine, residual            P, G                500        Analyze immediately
Chlorophyll                   P, G                500        30 days in dark; freeze
Color                         G                   500        -
Cyanide                       P, G                500        24 hr; add NaOH to pH 12;
                                                             refrigerate
Fluoride                      -                   300        -
Fluvial sediment c/           -                   -          -
Grease and oil                G, wide-mouth,      1,000      Add HCl to pH < 2
                              calibrated
Iodine                        P, G                500        Analyze immediately
Metals                        P, G                -          For dissolved metals separate by
                                                             filtration immediately; add 5 ml
                                                             conc HNO3/l
Nitrogen
  Ammonia                     P, G                500        Analyze as soon as possible; add
                                                             0.8 ml conc H2SO4/l; refrigerate
  Nitrate                     P, G                100        Analyze as soon as possible;
                                                             filter, add 0.8 ml conc H2SO4/l;
                                                             refrigerate
  Nitrite                     P, G                100        Analyze as soon as possible;
                                                             filter, add 40 mg HgCl2/l and
                                                             refrigerate or freeze at -20°C
  Total Dissolved Nitrogen    P, G                100        Analyze as soon as possible;
                                                             add 40 mg HgCl2/l and filter,
                                                             refrigerate
  Organic                     P, G                500        Analyze as soon as possible;
                                                             refrigerate or add 0.8 ml conc
                                                             H2SO4/l
Microbiological               P, G                500        6 hr; refrigerate
Odor                          G                   500        Analyze as soon as possible;
                                                             refrigerate
Oxygen, dissolved             G, BOD bottle       300        Analyze immediately
Pesticides (organic)          G(S)                -          -
pH                            P, G(B)             -          -
Phenol                        G                   500        24 hr; add H3PO4 to pH < 4.0
                                                             and 1 g CuSO4·5H2O/l;
                                                             refrigerate
Phosphorus (dissolved)        G(A)                100        Analyze as soon as possible;
                                                             for dissolved phosphates separate
                                                             by filtration immediately, freeze
                                                             at < -10°C and/or add 40 mg
                                                             HgCl2/l
Orthophosphate (dissolved)    G(A)                100        Analyze as soon as possible;
                                                             filter immediately, add 40 mg
                                                             HgCl2/l, refrigerate
Total Dissolved Phosphorus    G(A)                100        Analyze as soon as possible;
                                                             filter immediately, add 40 mg
                                                             HgCl2/l, refrigerate
Residue (TDS)                 P, G(B)             100        -
Salinity                      G, wax seal         240        Analyze immediately or use wax seal
Silica                        P                   -          -
Sulfate                       P, G                -          Refrigerate
Sulfide                       P, G                100        Add 4 drops 2N zinc acetate/100 ml
Sulfite                       P, G                -          Analyze immediately
Taste                         G                   500        Analyze as soon as possible;
                                                             refrigerate
Temperature                   -                   -          Analyze immediately
Turbidity                     P, G                -          Analyze same day; store in dark for
                                                             up to 24 hr

a/ See Standard Methods (APHA et al., 1976) and The Chemical Analysis of Fresh Water (Stainton et
al., 1977) for additional details. Use glass or plastic containers, preferably refrigerate
during storage, and analyze as soon as possible. Samples for cation and anion analysis should be
filtered in the field. For the design of a portable unit for filtering water samples at field
sites, see Kennedy and others (1976).

b/ P = plastic (polyethylene or equivalent); G = glass; G(A) or P(A) = rinsed with 1+1 HNO3;
G(B) = glass, borosilicate; G(S) = glass, rinsed with organic solvents.

c/ Follow USGS methods (Guy and Norman, 1970).

Stainton  and  others  (1977)  suggest  several  special  precautions  when 
sampling  for  nutrient  elements.  The  usually  low  levels  of  these  elements 
in  upland  water  resources  make  contamination  a  significant  problem.  While 
the  need  for  clean  samples  and  sample  containers  is  obvious,  there  are 
several  other  contamination  sources  which  must  be  avoided.  Small  amounts 
of  tobacco  ash,  dandruff  and  perspiration  contributed  by  field  personnel, 
or  plant  pollen  and  other  atmospheric  particulates  all  can  introduce 
significant  errors  into  nutrient  element  analysis.  Field  personnel  must  be 
made  aware  of  these  and  other  possible  sources  of  contamination. 

The foregoing discussion is by no means all-inclusive. It is
impossible to prescribe absolute rules for the prevention of all possible
changes. Some advice will be found in the discussions of methods of
determination of various constituents in Standard Methods (APHA and others,
1976) and The Chemical Analysis of Fresh Water (Stainton and others, 1977).
However, to a large degree, the dependability of water quality data must
rest on the experience and good judgment of the sampler and analyst.




8.0  Literature  Cited 

American  Public  Health  Association,  Inc.  and  others  1976.  Standard  methods 
for  the  analysis  for  water  and  waste  water.  13th  ed.  Am.  Public 
Health  Assoc.  874  p. 

Averett,  R.C.  1979.  The  use  of  select  parametric  statistical  methods  for 
the  analysis  of  water  quality  data.  Presented  at  the  USGS-BLM 
Conference  on  Water  Quality  in  Energy  Areas.  January  10-11,  Denver, 
Colorado.  16  p. 

Averett, R.C. 1977. Biological sampling and statistics. In Methods for
the collection and analysis of aquatic biological and microbiological
samples: U.S. Geol. Survey Techniques Water-Resources Inv., Book 5,
Chap. A4, p. 3-19.

Averett, R.C. 1976. A guide to the design of data programs and
interpretive projects. U.S. Geological Survey, Water Resources
Division, Central Region, Lakewood, Colorado 80225. 100 p.

Brown,  G.  1972.  Forestry  and  water  quality.  Oregon  State  University.  74  p. 

Brown,  E.,  M.W.  Skougstad  and  M.J.  Fishman.  1970.  Methods  for  collection 
and  analysis  of  water  samples  for  dissolved  minerals  and  gases.  U.S. 
Geol. Survey Techniques Water-Resources Inv., Book 5, Chap. A1,
160  p. 

Boynton,  J.L.  1972.  Managing  for  quality  -  A  plan  for  developing  water 
quality  surveillance  programs  on  National  Forests  in  California.  In 
the  Proceedings  of  a  Symposium  on  "Watersheds  in  Transition"  held  at 
Fort  Collins,  Colorado,  June  19-22.  AWRA  Proceedings  series  No.  14. 
p.  84-90. 

Busby,  J.F.  1980.   The  design  and  execution  of  a  groundwater  geochemical 
study.  A  class  handout,  U.S.  Geological  Survey,  Water  Resources 
Division,  Northern  Great  Plains  Aquifer  System  Assessment,  Mail  Stop 
418,  Denver  Federal  Center,  Lakewood,  Colorado  80225.  52  p. 

Cochran,  W.G.  1963.  Sampling  techniques.  John  Wiley  and  Sons,  Inc.,  New 
York,  New  York.  413  p. 

Freese,  F.  1962.  Elementary  forest  sampling.  Agricultural  Handbook  No. 
232.  USDA-Forest  Service.  91  p. 

Freeze, A.R. and J.A. Cherry. 1979. Groundwater. Prentice-Hall, Inc.
604  p. 

Greeson, P.E., et al. 1977. Methods for the collection and analysis of
aquatic  biological  and  microbiological  samples:  U.S.  Geol.  Survey 
Techniques  Water-Resources  Inv.,  Book  5,  Chap.  A4,  165  p. 

Guy,  H.P.  1970.  Fluvial  sediment  concepts:  U.S.  Geol.  Survey  Techniques 
Water-Resources Inv., Book 3, Chap. C1, 55 p.



Guy, H.P. and V.W. Norman. 1970. Field methods for measurement of fluvial
sediment: U.S. Geol. Survey Techniques Water-Resources Inv., Book 3,
Chap.  C2,  59  p. 

Hem,  J.D.  1970.  Study  and  interpretation  of  the  chemical  characteristics 
of  natural  water.  U.S.  Geol.  Survey  Water  Supply  Paper  1473.  363  p. 

Huibregtse,  K.R.  and  J.H.  Moser.  1976.  Handbook  for  sampling  and  sample 
preservation  of  water  and  waste  water.  U.S.  Department  of  Commerce, 
National  Technical  Information  Service.  PB-259-946.  257  p. 

Kennedy,  V.C.,  E.A.  Jenne  and  J.M.  Burchard.  1976.  Backflushing  filters 
for  field  processing  of  water  samples  prior  to  trace-element  analysis. 
U.S.  Geological  Survey,  Open  file  report  76-126.  11  p. 

Krygier,  J.T.  and  J.D.  Hall.  1971.  Proceedings  of  a  symposium  forest  land 
uses  and  stream  environment.  Oregon  State  University.  252  p. 

Lind,  O.T.  1979.  Handbook  of  common  methods  in  limnology.  Mosby  Company, 
St.  Louis,  Missouri.  199  p. 

McKee,  J.E.  and  H.W.  Wolf.  1963.  Water  quality  criteria.  California 
State  Water  Resources  Control  Board.  Publication  No.  3-A.  548  p. 

McNeely,  R.N.,  V.P.  Neimanis  and  L.  Dwyer.  1979.  Water  quality  sourcebook 
-  a  guide  to  water  quality  parameters.  Inland  Water  Directorate, 
Water  Quality  Branch,  Ottawa,  Canada.  Cat.  No.  En  37-541  1979.  89  p. 

Mendenhall, W., L. Ott and R.L. Schaeffer. 1971. Elementary survey
sampling. Wadsworth Publishing Company, Inc., Belmont, California.
247 p.

Ponce,  S.L.  1980.  Statistical  methods  commonly  used  in  water  quality  data 
analysis.  WSDG  Technical  Paper  WSDG-TP-00001.  WSDG,  USDA  -  Forest 
Service,  3825  E.  Mulberry  St.,  Fort  Collins,  CO  80524.  152  p. 

Potyondy,  J.  1977.  Guidelines  for  water  quality  sampling.  Determination 
of  detection  limits  and  sample  sizes.  WQ-3  East  Fork  Smiths  Fork 
Barometer  Watershed,  Wasatch  National  Forest,  Intermountain  Region, 
USDA  Forest  Service,  Ogden,  Utah.  8  p. 

Potyondy,  J.  1980.  Guidelines  for  water  quality  monitoring  plans.  Draft 
USDA-Forest  Service,  Intermountain  Region,  Soil  and  Water  Management, 
324  25th  Street,  Ogden,  UT  84401  49  p. 

Schwoerbel, J. 1970. Methods of hydrobiology. Pergamon Press, Limited.
Oxford,  England.  200  p. 

Snedecor,  G.W.  and  W.G.  Cochran.  1967.  Statistical  methods,  6th  ed.,  Iowa 
State  Univ.  Press,  Ames,  Iowa.  593  p. 




Stainton, M.P., M.J. Capel, and F.A.J. Armstrong. 1977. The chemical
analysis of freshwater. Fisheries and Environment Canada. Fisheries
and Marine Service. Misc. Spec. Pub. No. 25. Winnipeg, Manitoba.
180 p.

Thatcher,  L.L.,  V.J.  Janzer  and  K.W.  Edwards.  1977.  Methods  for 
determination  of  radioactive  substances  in  water  and  fluvial 
sediments: U.S. Geol. Survey Techniques Water-Resources Inv., Book
5,  Chap.  A5.  95  p. 

U.S.  EPA.  1971.  Studies  on  effects  of  watershed  practices  on  streams. 
Prepared  Oregon  State  University  School  of  Forestry.  13010  EGA 
02/71.  173  p. 

U.S.  EPA.  1973.  Processes,  procedures  and  methods  to  control  pollution 
resulting  from  silvicultural  activities.  EPA  430/9-73-010.  91  p. 

U.S.  EPA.  1976a.  Forest  harvest,  residue  treatment,  reforestation  and 
protection  of  water  quality.  EPA  910/9-76-020.  273  p. 

U.S.  EPA.  1976b.  Quality  criteria  for  water.  Washington,  D.C.  256  p. 

U.S.  EPA.  1977.  Silvicultural  chemicals  and  protection  of  water  quality. 
EPA  910/9-77-036.  224  p. 

U.S.  Forest  Service.  1980.  An  approach  to  water  resources  evaluation 
non-point  sources  silviculture.  Produced  under  USFS-EPA  amended 
interagency  agreement  EPA-IAG-D6-0660.  816  p. 

USGS.  1977.  National  handbook  of  recommended  methods  for  water-data 
acquisition.  U.S.  Department  of  Interior.  Reston,  Virginia. 

Walton,  W.C.  1970.  Groundwater  resource  evaluation.  McGraw-Hill.  N.Y., 
N.Y.  664  p. 

Welch,  P.S.  1948.  Limnological  Methods.  McGraw-Hill.  N.Y.,  N.Y.  381  p. 




APPENDIX 


Table A-1. Values of t (Steel and Torrie, 1960).

          Probability of a larger value of t, sign ignored
  df    0.5    0.4    0.3    0.2    0.1    0.05    0.02    0.01    0.001

   1  1.000  1.376  1.963  3.078  6.314  12.706  31.821  63.657  636.619
   2   .816  1.061  1.386  1.886  2.920   4.303   6.965   9.925   31.598
   3   .765   .978  1.250  1.638  2.353   3.182   4.541   5.841   12.941
   4   .741   .941  1.190  1.533  2.132   2.776   3.747   4.604    8.610
   5   .727   .920  1.156  1.476  2.015   2.571   3.365   4.032    6.859
   6   .718   .906  1.134  1.440  1.943   2.447   3.143   3.707    5.959
   7   .711   .896  1.119  1.415  1.895   2.365   2.998   3.499    5.405
   8   .706   .889  1.108  1.397  1.860   2.306   2.896   3.355    5.041
   9   .703   .883  1.100  1.383  1.833   2.262   2.821   3.250    4.781
  10   .700   .879  1.093  1.372  1.812   2.228   2.764   3.169    4.587
  11   .697   .876  1.088  1.363  1.796   2.201   2.718   3.106    4.437
  12   .695   .873  1.083  1.356  1.782   2.179   2.681   3.055    4.318
  13   .694   .870  1.079  1.350  1.771   2.160   2.650   3.012    4.221
  14   .692   .868  1.076  1.345  1.761   2.145   2.624   2.977    4.140
  15   .691   .866  1.074  1.341  1.753   2.131   2.602   2.947    4.073
  16   .690   .865  1.071  1.337  1.746   2.120   2.583   2.921    4.015
  17   .689   .863  1.069  1.333  1.740   2.110   2.567   2.898    3.965
  18   .688   .862  1.067  1.330  1.734   2.101   2.552   2.878    3.922
  19   .688   .861  1.066  1.328  1.729   2.093   2.539   2.861    3.883
  20   .687   .860  1.064  1.325  1.725   2.086   2.528   2.845    3.850
  21   .686   .859  1.063  1.323  1.721   2.080   2.518   2.831    3.819
  22   .686   .858  1.061  1.321  1.717   2.074   2.508   2.819    3.792
  23   .685   .858  1.060  1.319  1.714   2.069   2.500   2.807    3.767
  24   .685   .857  1.059  1.318  1.711   2.064   2.492   2.797    3.745
  25   .684   .856  1.058  1.316  1.708   2.060   2.485   2.787    3.725
  26   .684   .856  1.058  1.315  1.706   2.056   2.479   2.779    3.707
  27   .684   .855  1.057  1.314  1.703   2.052   2.473   2.771    3.690
  28   .683   .855  1.056  1.313  1.701   2.048   2.467   2.763    3.674
  29   .683   .854  1.055  1.311  1.699   2.045   2.462   2.756    3.659
  30   .683   .854  1.055  1.310  1.697   2.042   2.457   2.750    3.646
  40   .681   .851  1.050  1.303  1.684   2.021   2.423   2.704    3.551
  60   .679   .848  1.046  1.296  1.671   2.000   2.390   2.660    3.460
 120   .677   .845  1.041  1.289  1.658   1.980   2.358   2.617    3.373
   ∞   .674   .842  1.036  1.282  1.645   1.960   2.326   2.576    3.291

  df   0.25    0.2   0.15    0.1   0.05   0.025    0.01   0.005   0.0005
          Probability of a larger value of t, sign considered

*   U.S.  GOVERNMENT  PRINTING  OFFICE:  1981-779-389(285  Region  no.  8 


THE USE OF THE PAIRED-BASIN TECHNIQUE
IN FLOW-RELATED WILDLAND WATER-QUALITY STUDIES


By


Stanley L. Ponce
Watershed Systems Development Group
USDA Forest Service
Fort Collins, Colorado

David W. Schindler
Department of Fisheries and Oceans
Freshwater Institute
Winnipeg, Manitoba

Robert C. Averett
Water Resources Division - Central Region
U.S. Geological Survey
Denver Federal Center
Denver, Colorado


WSDG Report
WSDG-TP-00004

April 1982


USDA Forest Service
Watershed Systems Development Group
5823 East Mulberry Street
Fort Collins, Colorado 80524


TABLE OF CONTENTS

INTRODUCTION

THE PAIRED-BASIN TECHNIQUE
    Natural Correlation
    Stability of the Control
    Satisfying the Assumptions Underlying the ANCOVA
    Quality and Size of Data Base

APPLICATIONS OF THE PAIRED-BASIN TECHNIQUE
    Cause-and-Effect Evaluation
    Trend Analysis
    Cumulative-Impact Analysis

CONCLUSIONS

ACKNOWLEDGEMENTS

LITERATURE CITED


LIST OF FIGURES

Figure 1.   Location of stations A and B in relation to the harvested
            area on Trout Creek.

Figure 2.   Suspended sediment (SS) rating curves for stations A and B.
            Data collected at station A are denoted by °, whereas data
            collected at station B are denoted by +.

Figure 3.   An example of the paired-basin technique using two separate
            basins.

Figure 4.   An example of the paired-basin approach using a nested basin.

Figure 5.   Location of principal sampling stations on Bull Run
            Watershed, Mount Hood National Forest, Oregon.

Figure 6.   An example of cause-and-effect monitoring when the treatment
            can be isolated.

Figure 7.   Before and after treatment regressions of suspended sediment
            at station A (SS_A) against suspended sediment at station B
            (SS_B).

Figure 8.   An example of cause-and-effect monitoring when the treatment
            cannot be isolated.

Figure 9a.  Paired plots of turbidity (TURB) at stations A and C for
            1977-80 water years.

Figure 9b.  Paired plots of turbidity (TURB) at stations B and C for
            1977-80 water years.

Figure 10.  Nested station design for assessment of cumulative impacts.
            Station F is the control.

Figure 11.  Suspended sediment (SS) plots for stations located in the
            Clark Fork basin (A, B, C, D, and E) versus the control
            (station F) located in the Lewis Fork basin.


LIST OF TABLES

Table 1.    Coefficients of determination for selected water-quality
            characteristics from paired-basin analysis on the Bull Run
            Watershed, Mount Hood National Forest.

Table 2.    Summary of Consequences of Violation of Assumptions of ANCOVA
            (Glass, Peckham, and Sanders 1972).

Table 3.    Regression Coefficient - Land Use Characteristics Chart for
            Basin C.


THE  USE  OF   THE  PAIRED-BASIN  TECHNIQUE 
IN  FLOW-RELATED  WILDLAND  WATER-QUALITY   STUDIES 

INTRODUCTION 

One of the responsibilities of a forest hydrologist is to provide the
line officer with reliable information concerning the quality of the water
resource. This information is used to (1) evaluate the effectiveness of
soil and water conservation practices, (2) determine whether public health
standards and/or contractual obligations are being met, (3) determine
water-quality trends, and/or (4) evaluate the existing condition of water
quality. The problem, of course, is how to provide this information at an
acceptable level of reliability when constraints on time, manpower, and
money limit the number of water-quality samples that can be collected and
analyzed each year.

The success of the monitoring activity depends on the monitoring
design and the method selected for data analysis. Once the study
objectives have been defined, the goal is to obtain the required
information at a predetermined level of reliability with a minimum
expenditure of resources. This requires that the monitoring program and
subsequent data analysis be designed to minimize unexplained variation. In
studies involving streams, most water-quality constituents of interest to
the wildland hydrologist are strongly related to discharge. To account for
the variation due to flow, hydrologists commonly use regression techniques
to evaluate possible cause-and-effect relationships as well as temporal
trends.

The  most  frequently  used  regression  is  simply  a  plot  of  the  discharge 
against  the  concentration  of  a  given  water-quality  constituent.     An 
application  of  this  approach  is  illustrated  below. 



The  question  confronting  the  hydrologist  was:     "Does  the  harvested 
area  significantly  affect  the  suspended  sediment  loading  in  Trout  Creek 
during  water  year  1980?"     Sampling  stations  were  placed  upstream  and 
downstream  from  the  harvested  area  (Figure  1).     Suspended-sediment 
concentrations  were  measured  at  both  stations  such  that  each  part  of  the 
annual   streamflow  (baseflow,  snowmelt,   stormflow,  etc.)  was  sampled  during 
water  year  1980.     The  data  were  then  fit  to  the  regression  model: 

$$\log SS = \log b_0 + b_1 \log Q$$
where:

$SS$ = suspended-sediment concentration, in milligrams per liter;

$b_0$ = regression coefficient;

$b_1$ = regression coefficient; and

$Q$ = discharge, in cubic feet per second.
The  results  were  then  used  to  develop  the  sediment-rating  curves 
illustrated  in  Figure  2. 
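The rating-curve fit described above can be sketched as an ordinary least-squares regression on log-transformed data. This is a minimal illustration under stated assumptions: base-10 logarithms are used (the source does not say which base), the function names are hypothetical, and the discharge and concentration values are invented for the example rather than taken from Trout Creek.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)  # coefficient of determination
    return intercept, slope, r2


def fit_rating_curve(discharge_cfs, ss_mg_per_l):
    """Fit log SS = log b0 + b1 log Q; returns (b0, b1, r2)."""
    log_q = [math.log10(q) for q in discharge_cfs]
    log_ss = [math.log10(s) for s in ss_mg_per_l]
    log_b0, b1, r2 = linear_fit(log_q, log_ss)
    return 10 ** log_b0, b1, r2


# Hypothetical paired observations of discharge and suspended sediment.
q_cfs = [5.0, 12.0, 30.0, 75.0, 150.0, 400.0]
ss_mg_l = [4.0, 9.0, 30.0, 70.0, 210.0, 520.0]

b0, b1, r2 = fit_rating_curve(q_cfs, ss_mg_l)
print(f"SS = {b0:.3f} * Q^{b1:.3f}   (r2 = {r2:.3f})")
```

Back-transforming the intercept gives the power-law form SS = b0 * Q^b1, which is the curve usually drawn on log-log axes.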


Figure 1. Locations of stations A and B in relation to the harvested area
on Trout Creek.


It should be noted that the data are widely scattered about the
regression lines (Figure 2). The coefficient of determination ($r^2$)
generally ranges between 0.60 and 0.85 for most rating-curve regressions
involving water-quality constituents. The unexplained variation ($1 - r^2$)
in the regression is due to factors not accounted for by the relationship,
such as watershed conditioning, climate, and/or physical and biologic
factors (Beschta et al. 1981). Although hydrologists typically seek to
minimize the unexplained variation by judiciously selecting sampling
periods, it is rare that the $r^2$ will exceed 0.85. The statistical
difference between the regressions for stations A and B can be determined
with the analysis of covariance (ANCOVA) test of a common line or with the
Chow test (Wilson 1978) if the data meet the underlying assumptions of the
statistical tests.
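As a rough illustration of the comparison named above, the sketch below computes a Chow-type F statistic for two simple linear regressions: it asks whether one pooled line fits both stations about as well as two separate lines. The function names and data are hypothetical, the inputs are assumed to be already log-transformed, and the result must still be compared with a tabulated F value for (2, nA + nB - 4) degrees of freedom. It does not check the assumptions of the test.

```python
def residual_sum_of_squares(x, y):
    """Residual sum of squares about the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return syy - (sxy * sxy) / sxx


def chow_f_statistic(x_a, y_a, x_b, y_b):
    """F statistic for a common line versus separate lines (k = 2)."""
    k = 2  # parameters per line: intercept and slope
    rss_pooled = residual_sum_of_squares(list(x_a) + list(x_b),
                                         list(y_a) + list(y_b))
    rss_separate = (residual_sum_of_squares(x_a, y_a)
                    + residual_sum_of_squares(x_b, y_b))
    df_denominator = len(x_a) + len(x_b) - 2 * k
    f_value = ((rss_pooled - rss_separate) / k) / (rss_separate / df_denominator)
    return f_value, k, df_denominator


# Hypothetical log-transformed discharge (x) and sediment (y) values.
log_q_a = [0.7, 1.1, 1.5, 1.9, 2.2, 2.6]
log_ss_a = [0.6, 1.0, 1.5, 1.8, 2.3, 2.7]
log_q_b = [0.7, 1.1, 1.5, 1.9, 2.2, 2.6]
log_ss_b = [0.9, 1.4, 1.7, 2.3, 2.6, 3.1]

f_value, df1, df2 = chow_f_statistic(log_q_a, log_ss_a, log_q_b, log_ss_b)
print(f"F = {f_value:.2f} with ({df1}, {df2}) degrees of freedom")
```

A large F relative to the tabulated value suggests the two stations do not share a common rating curve; a value near zero suggests a single line describes both.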


[Plot; horizontal axis: LOG OF DISCHARGE (Q).]

Figure 2. Suspended sediment (SS) rating curves for stations A and B.
Data collected at station A are denoted by °, whereas data
collected at station B are denoted by +.


We have found that in some situations an extension of the paired-
basin technique, to be discussed in the section that follows, provides for
greater statistical control and enables the hydrologist to maximize
information gained while minimizing time, manpower, and other economic
expenditures. The purpose of this paper is to discuss the paired-basin
technique and illustrate its use for possible cause-and-effect evaluation,
trend analysis, and assessment of cumulative impacts.

THE PAIRED-BASIN TECHNIQUE

The  paired-basin  technique  was  first  used  by  U.S.  Forest  Service 
hydrologists  on  the  "Wagon  Wheel  Gap  Streamflow  Experiment"   (Bates  and 
Henry  1928).     Today  the  technique  commonly  is  used  by  hydrologists  to 
quantify  the  effects  of  land-use  practices  on  the  volume  and  timing  of 
streamflow.     In  recent  years,  the  technique  has  been  extended  to  flow- 
related,  water-quality  studies  by  a  few  investigators  (Averett,  Ponce,  and 
Schindler 1981; Schindler et al. 1980; Singh and Kalra 1972; Thut and Haydu
1971; Brown and Krygier 1971) and found to be an effective data analysis
tool.

The  paired-basin  technique  uses  two  basins  as  nearly  alike  as 
possible.     Ideally,  both  basins  are  about  the  same  size  and  have  similar 
soils,  vegetation,  elevation,  aspect,  climate,  and  streamflow 
characteristics.     Traditionally,  the     paired-basin  technique  uses  two 
separate  basins:     a  control   basin  providing  a  standard  for  comparison  and  a 
treatment  basin  (Figure  3).     However,  in  many  water-quality  studies  an 
upstream  and  downstream  sampling  method  is  used  to  isolate  a  treatment  area 
along  a  stream  reach  (Figure  4).     The  paired-basin  technique  also  can  be 
used  in  this  situation.     Instead  of  two  completely  separate  basins,  the 
control   basin  (that  drainage  area  upstream  from  the  upper  sampling  site)   is 
nested  within  the  treatment  basin. 


[Figure 3 diagram labels: CONTROL WATERSHED; TREATMENT WATERSHED; TREATMENT]


Figure  3.     An  example  of  the  paired-basin  technique  using  two  separate 
basins. 


[Figure 4 diagram label: TREATMENT]


Figure  4.     An  example  of  the  paired-basin  technique  using  a  nested 
subbasin. 


In either case, traditional or nested design, the technique requires
that data be collected both before and after treatment at both basins
(stations).  In the case of the traditional design, prior to treatment,
water-quality measurements (paired in time) are collected from both basins
throughout the hydrologic regime of interest.  These data are used to
establish the calibration-period regression of a water-quality constituent
of one basin upon the other.  Following calibration, the treatment basin is
treated and the collection of water-quality data is continued in both
basins.  The post-treatment data are used to develop the treatment-period
regression.  The two regressions (calibration and post-treatment) are then
compared using analysis of covariance (ANCOVA) to determine if there is a
statistically significant difference in the water-quality characteristics.
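
The comparison of the calibration and treatment-period regressions can be
sketched in a few lines of code.  The example below is a minimal
illustration, not the procedure used in the studies cited here: it assumes
a simple linear regression of the treatment basin on the control basin and
uses a general F-test for whether one pooled line fits both periods as well
as two separate lines.  The function names (fit_line, coincidence_f) and
the data layout are hypothetical, and the resulting F value would still
have to be compared against a tabled critical value.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, sse), where sse is the residual sum of squares.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, sse


def coincidence_f(control_cal, treat_cal, control_post, treat_post):
    """F statistic for 'one common line' versus 'separate lines per period'.

    control_* are values at the control basin, treat_* the paired values at
    the treatment basin, for the calibration (cal) and post-treatment (post)
    periods.  Returns (F, numerator df, denominator df).
    """
    _, _, sse_cal = fit_line(control_cal, treat_cal)
    _, _, sse_post = fit_line(control_post, treat_post)
    _, _, sse_pooled = fit_line(list(control_cal) + list(control_post),
                                list(treat_cal) + list(treat_post))
    sse_full = sse_cal + sse_post          # two intercepts, two slopes
    df_num = 2                             # extra intercept and extra slope
    df_den = len(control_cal) + len(control_post) - 4
    f_value = ((sse_pooled - sse_full) / df_num) / (sse_full / df_den)
    return f_value, df_num, df_den
```

A large F value relative to the critical value for (2, n - 4) degrees of
freedom suggests that the relation between the basins changed after
treatment; a formal ANCOVA would usually go on to test slopes and
intercepts separately.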

There  are  several    factors  that  affect  the  success  of  the  paired- 
basin  technique  (Reinhart  1967).     Those  that  need  to  be  considered 
carefully  when  applying  the  technique  include  natural   correlation, 
stability  of  the  control,   satisfying  the  assumptions  underlying  ANCOVA,  and 
quality  and  size  of  the  data  base. 

Natural  Correlation 

The  degree  of  correlation  that  exists  naturally  between  paired  basins 
for  a  given  water-quality  property  or  constituent  is  of  primary  importance. 
Suspended  sediment,  turbidity,  and  electrical   conductivity  usually 
correlate  well    for  basins  that  are  similar.     This  point  is  illustrated 
using  two  sets  of  basin  pairs  within  the  Bull   Run  Watershed  on  the 
Mount  Hood  National   Forest,  Oregon  (Figure  5).     The  paired  basins  used  met 
the  underlying  criteria  of  similarity  in  elevation,  aspect,   soils, 
vegetation,  climate,  and  streamflow.     Basin  44  served  as  the  control   and 
was  paired  with  treatment  basins  18  and  35  (Table  1). 




Figure  5.  Location  of  principal  sampling  stations  on  the  Bull  Run  Watershed,  Mount  Hood 
National  Forest,  Oregon. 


Table 1.  Coefficients of determination (r²) for selected water-quality
characteristics from paired-basin analysis in the Bull Run Watershed,
Mount Hood National Forest.


Paired Basins   Characteristic            Years of Record   r²      n

44 and 18       Suspended Solids          76-79             0.98    90
44 and 35       Suspended Solids          76-79             0.94    98
44 and 18       Turbidity                 76-79             0.95    72
44 and 35       Turbidity                 76-79             0.87    163
44 and 18       Electrical Conductivity   76-79             0.91    185
44 and 35       Electrical Conductivity   76-79             0.90    191


In  general,  a  decrease  in  similarity  between  basins  results  in  a 
decreased  degree  of  correlation.     For  example,  during  stormflow  it  is 
likely  that  the  hydrographs  will   be  out  of  phase  if  the  paired  basins  are 
not  similar  in  size.     Consequently,  at  a  given  point  in  time  the  flow 
characteristics  at  each  sampling  station  will  be  different  rather  than 
similar.     Such  a  condition  will   add  unwanted  variation  to  the  relationship 
and  reduce  the  strength  of  the  procedure. 

Stability  of  the  Control 

It  is  important  that  the  control   basin  remain  as  stable  as  possible. 
Any  factor  that  changes  the  character  of  the  control  will   detract  from  the 
usefulness  of  the  method.     Consequently,  when  selecting  a  control   basin 
care  needs  to  be  taken  to  select  one  in  or  as  near  a  state  of  equilibrium 
as  possible.     Often  it  is  useful   to  select  two  control   basins  in  case  one 
is  altered  during  the  study  period,  such  as  by  fire  or  some  other 
catastrophic  event. 



Satisfying  the  Assumptions  Underlying  the  ANCOVA 

The  validity  of  the  inferences  drawn  from  the  results  of  the  ANCOVA  is 
related,  to  a  greater  or  lesser  extent,  to  whether  the  underlying 
assumptions  are  satisfied.     The  relevant  question  is  not  whether  the  ANCOVA 
assumptions  are  met  exactly,  but  rather  whether  the  plausible  violations  of 
the  assumptions  have  serious  consequences  on  the  validity  of  probability 
statements  based  on  the  standard  assumptions.     The  primary  assumptions  that 
need  to  be  considered  are  (1)   independence  of  errors,   (2)  normality,  and 
(3)  homogeneity  of  the  variances.     The  consequences  of  violation  of 
assumptions  of  ANCOVA  are  summarized  in  Table  2. 

It should be pointed out that it is not uncommon to find in a time
series of hydrologic data that an observation at one time period (t) is
correlated with the observation in the preceding time period (t - 1) or
time periods (t - 2, etc.).  In other words, an observation collected at
time t may not be independent of one collected at time t - 1 or t - 2, etc.,
when the time intervals are short.  Such dependency is termed serial
correlation or autocorrelation.
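
As an illustration of what serial correlation measures, the following
minimal sketch computes the sample autocorrelation of an equally spaced
series at a chosen lag.  It assumes the usual sample estimator (lagged
cross-products about the overall mean, divided by the total sum of
squares); the function name lag_autocorrelation is hypothetical and is not
part of any package mentioned in this paper.

```python
def lag_autocorrelation(series, lag=1):
    """Sample autocorrelation of an equally spaced series at the given lag."""
    n = len(series)
    if lag < 1 or lag >= n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    total_ss = sum((value - mean) ** 2 for value in series)
    lagged = sum((series[t] - mean) * (series[t - lag] - mean)
                 for t in range(lag, n))
    return lagged / total_ss
```

Values near zero are consistent with independent observations; values
near +1 indicate that each observation largely repeats the one before it.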

What  is  the  effect  of  serial   correlation  on  tests  of  significance 
regarding  regression  equations?     Essentially,  if  a  significant  level   of 
serial  correlation  exists  the  data  are  not  independent  and  tests  of 
significance  regarding  any  regression  equations  have  limited  utility 
(Table  2). 

This  raises  the  question:     "How  frequently  can  observations  be 
collected while still maintaining independence?"  Unfortunately, there is
not a simple answer to this question.  Identifying the characteristics and
structure  of  serial   correlation  in  time  series  data  of  water  quality 
constituents  represents  one  of  the  important  areas  of  research  facing 
statisticians.     However,   it  is  suggested  by  Beschta  (1981)  that 


Table 2.  Summary of Consequences of Violation of Assumptions of ANCOVA
(Glass, Peckham, and Sanders 1972).

Columns:  Type of Violation; Equal n's (Effect on Level of Significance (α),
Effect on Power); Unequal n's (Effect on Level of Significance (α),
Effect on Power).

Non-independence of errors
   All columns:  Non-independence of errors seriously affects both the
   level of significance and power of the F-test regardless of whether n's
   are equal or unequal.

Non-normality: Skewness
   All columns:  Skewed populations have very little effect on either the
   level of significance or the power of the fixed-effects model F-test;
   distortions of nominal significance levels or power values are rarely
   greater than a few hundredths.  (However, skewed populations can
   seriously affect the level of significance and power of directional, or
   "one-tailed," tests.)

Non-normality: Kurtosis
   Equal n's, effect on α:  Actual α is less than nominal α when
   populations are leptokurtic (i.e., β2 > 3).  Actual α exceeds nominal α
   for platykurtic populations.  (Effects are slight.)
   Equal n's, effect on power:  Actual power is less than nominal power
   when populations are platykurtic.  Actual power exceeds nominal power
   when populations are leptokurtic.  Effects can be substantial for
   small n.
   Unequal n's, effect on α:  Actual α is less than nominal α when
   populations are leptokurtic (i.e., β2 > 3).  Actual α exceeds nominal α
   for platykurtic populations.  (Effects are slight.)
   Unequal n's, effect on power:  Actual power is less than nominal power
   when populations are platykurtic.  Actual power exceeds nominal power
   when populations are leptokurtic.  Effects can be substantial for
   small n's.

Heterogeneous variances
   Equal n's, effect on α:  Very slight effect on α, which is seldom
   distorted by more than a few hundredths.  Actual α seems always to be
   slightly increased over the nominal α.
   Equal n's, effect on power:  (No theoretical power value exists when
   variances are heterogeneous.)
   Unequal n's, effect on α:  α may be seriously affected.  Actual α
   exceeds nominal α when smaller samples are drawn from more variable
   populations; actual α is less than nominal α when smaller samples are
   drawn from less variable populations.
   Unequal n's, effect on power:  (No theoretical power value exists when
   variances are heterogeneous.)

Combined non-normality and heterogeneous variances
   All columns:  Non-normality and heterogeneous variances appear to
   combine additively ("non-interactively") to affect either level of
   significance or power.  (For example, the depressing effect on α of
   leptokurtosis could be expected to be counteracted by the elevating
   effect on α of having drawn smaller samples from the more variable,
   leptokurtic populations.)



observations collected during stormflow should have an interval of three
or more hours.  During snowmelt runoff and low-flow periods, it appears
that observations need to be obtained two or more weeks apart to assure
independence.  If the observations are equally spaced in time, the serial
correlations can be tested using the BMDP2T Program (BMDP 1981), which is
readily available on the computer at the Fort Collins Computer Center.
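
Where a record was sampled more often than these suggested intervals, one
simple option is to thin it before analysis.  The sketch below is only an
illustration under that assumption: it keeps an observation time only if
it falls at least a chosen minimum gap after the last one kept.  The
function name thin_by_interval is hypothetical, and the gap itself (for
example, three hours for stormflow) must still be chosen by the
hydrologist.

```python
from datetime import datetime, timedelta


def thin_by_interval(timestamps, min_gap):
    """Keep observation times spaced at least min_gap apart.

    timestamps: iterable of datetime objects (any order).
    min_gap: timedelta giving the minimum spacing between kept observations.
    """
    kept = []
    for stamp in sorted(timestamps):
        # Always keep the first observation; afterwards require the gap.
        if not kept or stamp - kept[-1] >= min_gap:
            kept.append(stamp)
    return kept
```

Thinning discards data, so it is a stopgap; the serial correlation of the
thinned series should still be checked.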

For further reading about the assumptions underlying the ANCOVA, see
Elashoff (1969); Glass, Peckham, and Sanders (1972); and Wildt and Ahtola
(1978).

Quality  and  Size  of  Data  Base 

Adequate and correct data are essential to the success of any study.
No amount of statistical maneuvering can make up for sloppy data.  Because
many individuals may be involved in data collection, it is good practice to
establish written data-collection standards and ensure that they are
adhered to throughout the study.

In  general,  the  larger  the  statistical   sample,  the  more  precise  the 
paired-basin  regression  relationship  will  be.     Wilm  (1949),  Kovner  and 
Evans  (1954),  and  Kovner  (1968)  describe  methods  for  determining  the 
minimum  length  of  streamflow  experiments  using  the  paired  basin  technique. 
However,  these  methods  cannot  be  applied  to  water  quality  experiments  for 
determination  of  the  sample  size  which  will  yield  regression  estimates  at  a 
predetermined  level   of  statistical    reliability  because  they  assume  equal 
variances  and/or  that  the  slopes  of  the  regression  lines  are  the  same  for 
the  calibration  and  treatment  periods. 

At  this  time,  we  know  of  no  procedures  available  to  the  hydrologist  to 
determine  a  specific  sample  size  which  will   permit  a  comparison  test  at  a 
predetermined  level   of  statistical   reliability.     Our  advice  is  that  a 
minimum of 15 observations be collected per station per year.  It is
important, of course, that the samples are collected throughout the
sampling period relative to the relationship between the flow
characteristics and water quality constituent of concern.




APPLICATIONS  OF   THE  PAIRED-BASIN   TECHNIQUE 


Cause-and-Effect  Evaluation 

The  paired-basin  technique  is  ideal    for  evaluating  possible 
cause-and-effect  relationships.     Consider  the  situation  illustrated  in 
Figure  6.     Here  we  have  a  treatment  isolated  by  placing  stations  upstream 
(station  A)  and  downstream  (station  B)   from  the  treatment.     The  problem  is 
to  determine  the  effect  of  the  treatment  on  a  specific  water-quality 
characteristic,  such  as  suspended  sediment.     The  strategy  to  be  used  in 
this  situation  is  to  establish  a  pair  of  basins  (stations)  A  and  B  and 
collect  data  before  and  after  the  treatment. 


Figure 6.  An example of cause-and-effect monitoring when the treatment
can be isolated.




The  data  can  be  related  as  illustrated  in  Figure  7.  Analysis  of 
covariance  can  be  used  to  determine  if  the  treatment  had  a  statistically 
significant  effect  on  the  suspended  sediment  of  the  system. 


[Figure 7 plot labels: SS axes for stations A and B; regression lines
labeled BEFORE TREATMENT and AFTER TREATMENT]


Figure 7.  Before and after treatment regressions of suspended sediment at
station A (SS_A) against suspended sediment at station B
(SS_B).

Another example of possible cause-and-effect evaluation is presented
in Figure 8.  In this example, the treatment cannot be isolated by placing
stations upstream and downstream from it.  Consequently, a control basin
(A in Figure 8) needs to be selected and data collected both before and
after the treatment.  Data analysis would be similar to that previously
described.


[Figure 8 diagram labels: CONTROL WATERSHED; TREATMENT WATERSHED]


Figure  8.     An  example  of  cause-and-effect  monitoring  when  the  treatment 
cannot  be  isolated. 




Trend  Analysis 

The paired-basin technique can also be used for trend analysis.  The
data from the test station are compared with the data from the control
station throughout a series of time intervals of interest, such as seasons
or years.  A regression relationship is developed for each time interval,
and the trend in water quality is evaluated by comparing the slopes and
intercepts of the regressions.

Consider, for example, three baseline stations:  A and B represent
actively managed basins while station C represents the control basin.  The
hydrologist would like to determine if there is a trend in the turbidity on
an annual basis.  Paired-in-time data were collected at each station during
the 1977-80 water years.  The resultant regressions are presented in
Figures 9a and 9b.  The data can be analyzed using a multiple comparison
approach where successive regressions are compared using ANCOVA.
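
The first step of such a trend analysis, fitting one regression per water
year, can be sketched as follows.  This is a minimal illustration that
assumes a simple linear regression of the test station on the control
station; the record layout (water year, control value, test value) and the
function names are hypothetical.  It only tabulates the yearly slopes and
intercepts and does not perform the ANCOVA comparison itself.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def yearly_regressions(records):
    """Fit test-on-control regressions separately for each water year.

    records: iterable of (water_year, control_value, test_value) tuples,
    one tuple per paired-in-time observation.
    Returns {water_year: (intercept, slope)} in ascending year order.
    """
    by_year = {}
    for year, control, test in records:
        xs, ys = by_year.setdefault(year, ([], []))
        xs.append(control)
        ys.append(test)
    return {year: fit_line(xs, ys) for year, (xs, ys) in sorted(by_year.items())}
```

A slope that climbs from year to year corresponds to the fanning pattern
described for station A relative to station C.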


[Figure 9 plot axis labels: TURB_A and TURB_C (9a); TURB_B and TURB_C (9b)]


Figure 9a.  Paired plots of turbidity (TURB) at stations A and C for
1977-80 water years.

Figure 9b.  Paired plots of turbidity (TURB) at stations B and C for
1977-80 water years.




Turbidity  at  station  A  is  increasing  annually  relative  to  the 
turbidity  at  station  C  (Figure  9a).  Whether  or  not  the  source  is  related 
to  management  activities  cannot  be  determined  from  the  paired  plot  alone 
but  requires  onsite  observation  and  interpretation  by  the  hydrologist.  It
is  evident,  however,  that  there  is  a  definite  trend  in  the  relation. 
Figure  9b  indicates  little  change  in  the  relative  relationship  in  turbidity 
between  stations  B  and  C.  This  indicates  that  the  management  activities 
used  in  basin  B  throughout  the  period  of  study  did  not  change  the  turbidity 
yield  from  basin  B  relative  to  the  control  basin  or  station  C. 

Cumulative-Impact  Analysis

The  paired-basin  technique  can  be  an  effective  analysis  tool  for 
assessment  of  cumulative  impacts.  The  procedure  is  to  develop  a  series  of 
nested  stations  (subbasins)  throughout  the  basin.  Paired  sampling  is  used 
and  the  water  quality  at  each  station  is  correlated  with  that  at  a  control 
station  throughout  a  specified  time  interval.  The  relationships  between 
stations  are  compared  and  related  to  land  use  and  other  factors  affecting 
the  system. 

Consider  the  situation  illustrated  in  Figure  10  where  there  are  two 
basins,  one  being  intensely  managed  for  timber  production  (Clark  Fork)  and 
the  other  used  for  a  control  (Lewis  Fork).  The  hydrologist  wishes  to 
determine  the  cumulative  impacts  of  timber  harvesting  on  the  annual 
suspended  sediment  regime  of  Clark  Fork  for  the  1977-80  water  years. 


[Figure 10 diagram labels: CLARK FORK; LEWIS FORK]


Figure 10.  Nested station design for assessment of cumulative impacts.
Station F is the control.


Suspended  sediment  samples  were  collected,  paired-in-time, 
throughout  the  4  water  years  of  interest.     At  the  end  of  each  water  year, 
the  paired  plots  were  developed  (Figure  11)  and  the  regression  coefficients 
tabulated  along  with  a  series  of  stream,  vegetative,  and  landform 
characteristics  (see  Table  3).     The  regression  relationships  between 
stations  can  then  be  compared  and  related  to  the  stream,  vegetative,  and 
landform  characteristics,   and  cumulative  impacts  may  then  be  evaluated. 


[Figure 11 plot labels: suspended sediment (SS) axes for the paired
stations; regression lines labeled by water year (77, 78, 79, 80)]


Figure 11.  Suspended sediment (SS) plots for stations located in the
Clark Fork basin (A, B, C, D, and E) versus the control
(station F) located in the Lewis Fork basin.


In this example, it appears in Figure 11 that the suspended-sediment
yield at station E is increasing in relation to the control station.  It is
also apparent that the primary source of this increase is watershed C.  The
paired-basin, suspended-sediment regressions of stations A, B, and D with
station F remain fairly consistent with time (no significant difference at
the 0.10 level) while station C is increasing in relation to station F
(significantly different at the 0.10 level).  Examination of Table 3
provides some insight into the source of the problems.  The b1 regression
coefficients (slope), percentage of the basin harvested, and area affected
by roads increase with time, while the channel-stability condition,
pool-and-riffle quality, and condition of the riparian vegetation
decline with time.  This indicates that the silvicultural


Table 3.  Regression Coefficients and Land-Use Characteristics for Basin C

[Table body illegible in the source.  Legible column groups: WATER YEAR;
REGRESSION COEFFICIENTS; STREAM CHARACTERISTICS; VEGETATIVE CHARACTERISTICS
(including PERCENT OF AREA HARVESTED BY TYPE and TOTAL); LANDFORM
CHARACTERISTICS (ROADS by RIPARIAN, MIDSLOPE, and ridge position; MASS
WASTING (AC)).]

Footnotes:  CSR = Channel Stability Rating; CRV = Condition of the Riparian
Vegetation; SKY = Skyline Yarding; TRAC = Tractor Yarding.  Other footnoted
entries: Pool/Riffle ratio; Pool Quality; Riffle Quality; High Lead
Yarding; Number of stream crossings x miles of road.



and  road  construction  and  maintenance  practices  are  affecting  suspended- 
sediment  yield  from  this  watershed  and  that  part  of  this  increased  yield  is 
being  transported  out  of  the  basin.  If  this  response  is  having  a 
detrimental  effect  on  a  specified  beneficial  use,  then  this  information  can 
be  used  to  adjust  soil  and  water  conservation  practices  for  watershed  C. 

CONCLUSIONS 

We  believe  the  paired-basin  technique,  if  used  properly,   is  an 
effective  tool   for  analyzing  water-quality  data  from  upland  streams.     In 
some  situations,  the  technique  provides  for  greater  statistical   control 
(minimizes  the  unexplained  variation)  and  enables  the  watershed  specialist 
to  maximize  information  gained  while  minimizing  time,  manpower,  and 
economic  expenditures. 

As  with  any  statistical   tool,  the  paired-basin  technique  will   only 
provide  you  with  "yes"  and  "no"  answers.     The  regression  relations  will 
only  provide  you  with  insight  to  the  hydrologic  system  and  water  quality 
response.     Data  interpretation  is  an  intellectual   activity  requiring  all 
the  skills  of  a  professional   wildland  hydrologist. 

ACKNOWLEDGEMENTS 
The  authors  would  like  to  acknowledge  the  numerous  individuals   in  the 
Forest  Service,  U.S.  Geological    Survey,  and  the  Bureau  of  Land  Management 
who  reviewed  this  report.     A  very  special    thanks  is  extended  to 
Dr.   Robert  Thomas,  Mathematical    Statistician,  of  the  Pacific  Southwest 
Forest  Experiment  Station  for  his  detailed  review  and  helpful   comments. 




LITERATURE  CITED 

Averett, R. C., S. L. Ponce, and D. W. Schindler.  1981.  A Report on:
Bull Run Watershed Water-Quality Management.  Bureau of Water Works,
City of Portland, Oregon, 1800 S. W. Sixth Avenue, Portland, Oregon
97201.  53 p.

BMDP.  1981.  BMDP  Statistical  Software  1981  Edition.  University  of 
California  Press,  LTD.  Berkeley,  California.  725  p. 

Bates,  C.  G.,  and  A.  J.  Henry.  1928.  Forest  and  Streamflow  Experiment 
at  Wagon  Wheel  Gap,  Colorado.  Final  report  upon  completion  of  the 
second  phase  of  the  experiment.  U.S.D.A.  Weather  Bureau,  Monthly 
Weather  Review,  Supplement  No.  30.  79  p. 

Beschta,  R.  L.  1981.  Sediment  and  Bedload  Sampling  Study  Design  and
Analysis.  Lecture  presented  at  the  R-1  Soil  and  Hydrology  Workshop.
April  27-30,  Missoula,  Montana.

Beschta,  R.  L.,  S.  J.  O'Leary,  R.  E.  Edwards,  and  K.  0.  Knoop.  1981. 
Sediment  and  Organic  Matter  Transport  in  Oregon  Coast  Range  Streams. 
Water  Resources  Research  Institute.  OSU  Report  WRRI-70.  Oregon  State 
University,  Corvallis,  Oregon  97331.  67  p. 

Brown,  G.  W.,  and  J.  T.  Krygier.  1971.  Clearcut  Logging  and  Sediment 
Production  in  the  Oregon  Coast  Range.  Water  Resources  Research, 
7(5):1189-1199. 

Elashoff,  J.  C.  1969.  Analysis  of  Covariance:  A  Delicate
Instrument.  American  Educational  Research  Journal,  6:383-401.

Glass,  G.  V.,  P.  D.  Peckham,  and  J.  R.  Sanders.  1972.  Consequences  of
Failure  to  Meet  Assumptions  Underlying  the  Fixed  Effects  Analyses  of
Variance  and  Covariance.  Rev.  Educ.  Res.,  42:237-288.

Kovner,  J.  L.  1968.  Calibration  of  Paired  Watersheds.  Watershed
Management  Research:  Semiannual  Report.  Rocky  Mountain  Forest  and
Range  Experiment  Station,  April-September,  p.  RM1-RM4.

Kovner,  J.  L.,  and  T.  C.  Evans.  1954.  A  Method  for  Determining  the 
Minimum  Duration  of  Watershed  Experiments.  Transactions,  American 
Geophysical  Union,  35(4):608-612. 

Reinhart,  K.  G.  1967.  Watershed  Calibration  Methods.  Presented  in
Proceedings  of  an  International  Symposium  on  Forest  Hydrology.  Ed.
by  W.  Sopper  and  H.  Lull.  Pennsylvania  State  University,  p.  715-723.

Schindler,  D.  W.,  R.  W.  Newburg,  K.  G.  Beaty,  J.  Prokopowich,
T.  Ruszczynski,  and  J.  A.  Dolton.  1980.  Effects  of  a  Windstorm
and  Forest  Fire  on  Chemical  Losses  from  Forested  Watersheds
and  on  the  Quality  of  Receiving  Streams.  Can.  J.  Fish.  Aquat.
Sci.,  37(3):328-334.




Singh,  T.,  and  Y.  P.  Kalra.  1972.  Water  Quality  of  an  Experimental
Watershed  During  the  Calibration  Period.  Paper  presented  at  the
19th  annual  meeting  of  the  Pacific  Northwest  Region,  American
Geophysical  Union,  Vancouver,  B.C.  (October  16-17).

Thut,  R.  N.  and  E.  P.  Haydu.  1970.  Effects  of  Forest  Chemicals  on 
Aquatic  Life.  Proceedings  of  a  symposium:  Forest  Land  Uses  and 
Stream  Environment.  Oregon  State  University.  Corvallis,  Oregon 
(October  19-21).  p.  159-171. 

Wildt,  A.  R.,  and  O.  Ahtola.  1978.  Analysis  of  Covariance.  Sage
Publications,  Inc.  Beverly  Hills,  Calif.  90212.  93  p.

Wilm,  H.  G.  1949.  How  Long  Should  Experimental  Watersheds  be
Calibrated?  Transactions,  American  Geophysical  Union,  30(2):272-278.

Wilson,  A.  L.  1978.  When  is  the  Chow  Test  UMP?  The  American 
Statistician.  Vol.  32.,  No.  2,  p.  66-68. 




ESTIMATING  SOIL  EROSION  USING  AN  EROSION  BRIDGE 


BY 

Darlene  G.  Blaney 
Statistician 

Gordon  E.  Warrington 
Soil  Scientist 


WSDG  Report 

WSDG-TP-00008 

August  1983 


USDA  Forest  Service 

Watershed  Systems  Development  Group 

3825  East  Mulberry  Street

Fort  Collins,  Colorado  80524 


TABLE OF CONTENTS

INTRODUCTION

MEASURING SOIL EROSION WITH AN EROSION BRIDGE

     The Project Area

     Selecting the Primary Sampling Units

     Equipment

     Recording Field Data

STATISTICAL ANALYSIS

     Definition of Terms

     Confidence Intervals

     Determining the Sample Size

COMMENTS

REFERENCES

APPENDIX A:  Some Statistical Tables

APPENDIX B:  The 3-F Erosion Bridge - A New Tool for Measuring
             Soil Erosion

APPENDIX C:  Erosion Bridge Construction

APPENDIX D:  Stratified Sampling


LIST OF EXAMPLES

Example 1.  Selecting the primary sampling units.

Example 2.  Selecting the angle associated with each primary
            sampling unit.

Example 3.  Statistical computations with some erosion bridge data.

Example 4.  Finding a value in the t-tables.

Example 5.  Putting a confidence interval on the erosion rate.

Example 6.  Putting a confidence interval on the difference
            between two erosion rates.

Example 7.  Determining the sample size using the sample
            variance from a previous study.


LIST OF FIGURES

Figure 1.  A grid placed at a 9 angle over a map
           of the project area.

Figure 2.  Field data sheet.

Figure 3.  Measuring differences in the soil level over time.

Figure C.1.  Erosion bridge.

Figure C.2.  Measuring rod.


LIST OF TABLES

Table 1.  Measurements at time t.

Table 2.  Measurements at time t + Δt.

Table 3.  Differences in the soil level between time t and
          time t + Δt.

Table A.1.  Ten thousand random digits.

Table A.2.  Critical values for Student's t-distribution.


ABSTRACT 


An easy-to-use, field-based procedure for estimating soil erosion
caused by soil disturbances in mountainous areas is discussed in detail.
Measurements are taken two or more times a season using an inexpensive
erosion bridge, a four-foot aluminum masonry level placed on two fixed
support pins.  A sampling unit consists of two erosion bridges placed
end-to-end; the distance to the soil surface is measured at 10 fixed
points along each bridge.  An unbiased estimator of the surface change is
obtained by an initial random selection of both sampling units and
erosion bridge orientation.  These data are used to calculate the average
change in the soil surface elevation.  Examples show how to use
t-statistics to calculate a confidence interval for the mean, compare a
measured mean with a standard value, or compare the values of two
measured means.  Resulting information can be used to evaluate
cause-and-effect relationships between management practices and soil
erosion as well as soil erosion effects on vegetative productivity and
sediment production.


INTRODUCTION 

The  National  Forest  Management  Act  (P.L.  94-588)  requires  the  Forest 
Service,  among  other  things,  to  monitor  the  effects  of  its  management 
practices  to  ensure  that  there  will  be  no  substantial  or  permanent 
impairment  of  land  productivity.  The  regulations  resulting  from  this  act 
require  documentation  of  these  effects  and  "a  quantitative  estimate  of 
performance  comparing  outputs  ...  with  those  projected  by  the  Forest 
Plan"  (36  CFR  219.12(k)(1)).  An  output  is  defined  as  "a  good,  service,  or
on-site  use  produced  from  forest  and  rangeland  resources"  (FSM  1970.5). 

Land  management  practices,  as  well  as  natural  events,  may  accelerate 
soil  erosion,  affecting  vegetative  productivity  and  water  quality. 
Through  monitoring,  data  can  be  collected  and  analyzed  to  provide  insight
into  the  magnitude  and  timing  of  soil  loss  for  a  particular  set  of  soil 
characteristics,  site  conditions,  and  management  practices.  You  can  use 
this  information  to  document  the  effects  of  soil  erosion  on  land 
productivity  and  to  adjust  output  projections  for  forest  planning. 

This report presents a quantitative procedure for measuring elevation
changes of the mineral soil surface caused by sheet and rill erosion.
The technique can be easily used by forest watershed specialists and
technicians as part of a soil erosion monitoring program.  A sampling
design is presented along with details on establishing sample points,
using the equipment, recording field data, and analyzing the data
statistically.


MEASURING  SOIL  EROSION  WITH  AN  EROSION  BRIDGE 

You  can  determine  soil  loss  or  accumulation  trends  occurring  after  a 
management  practice  by  taking  measurements  of  the  mineral  soil  surface 
elevation  two  or  more  times  a  season  with  an  erosion  bridge.  If  there  is 
soil  loss,  the  initial  erosion  rates  are  expected  to  be  high  and  then  to 
taper  off  to  some  natural  base  rate  (Megahan  1974;  Megahan  and  Kidd  1972). 
Establish  elevation  reference  points  by  driving  a  pair  of  steel  stakes 
into  the  ground  at  each  sampling  location.  Use  the  horizontal  bubble  of 
an  erosion  bridge  to  level  the  two  stakes.  Place  an  erosion  bridge  on 
the  steel  stakes  and  measure  the  distance  a  metal  rod  protrudes  above  the 
bridge  at  each  of  ten  holes  along  the  bridge.  Calculate  the  amount  of 
soil  loss  (or  deposition)  from  these  measurements  and  make  notes  about 
the  soil  surface  conditions  to  aid  in  validating  data  obtained  from  the 
bridge. 

The  Project  Area 

The  project  area  is  the  population  from  which  samples  are  to  be  taken.
A  project  area  will  fall  in  one  of  two  categories:

1.  A  site  on  which  management  activity  has  taken  place — a  timber 
sale,  for  example. 

2.  A  site  on  which  management  activity  will  take  place.  That  is,  a 
site  will  be  monitored  before  a  management  activity  is 
implemented  in  order  to  establish  the  average  rate  of  erosion  for 
the  area.  It  will  again  be  monitored  after  the  activity  to 
determine  if  accelerated  erosion  occurs. 




Once  a  project  area  has  been  selected,  you  want  to  select  sampling 
units  which  are  representative  of  the  area.  Random  sampling  is  one  tool 
which  can  aid  you  in  this  endeavor.  In  addition,  random  selection 
forestalls  the  criticism  that  your  sampling  was  biased  due  to  your 
preconceived  ideas  (Hill  1962,  p.  10). 

Throughout  this  report,  the  term  "primary  sampling  unit"  (psu)  refers 
to  a  specific  location  within  the  project  area  spanned  by  two  erosion 
bridges  placed  end-to-end.  At  each  psu,  20  measurements  will  be  taken. 
Each  of  these  20  measurements  is  referred  to  as  a  "secondary  sampling 
unit"  (ssu).  Thus  a  complete  set  of  data  consists  of  n  primary  sampling 
units  with  20  secondary  sampling  units  taken  at  each  primary  sampling 
unit. 

Selecting  the  Primary  Sampling  Units 

For the present discussion assume that the sample size n has already
been determined. A method for deciding how large a sample to take is
covered in the section Determining the Sample Size.

To randomly select the n psu's on the project area, proceed as
follows. Consult a random number table (see Appendix A, Table A.1) and
select any three columns and any row. Beginning there, use the first
number between 1 and 180, inclusive. This number represents the angle you
will use to orient a predetermined systematic grid over the project area
(magnetic north is 0 degrees) (Howes, Hazard, and Geist 1981). After the
grid has been randomly oriented, there should be N > n grid intersection
points on the project area. Number these N points 1, 2, . . ., N,
beginning at the uppermost left corner and proceeding in a serpentine
pattern. Finally, select the n psu's from these N points by again using
a random number table.

Suppose  N  =  24  and  n  =  9.  Refer  to  a  random  number  table  and  select 
any  two  columns  and  any  row  as  a  starting  place.  Proceed  down  these 
columns  and  select  the  first  nine  distinct  numbers  between  1  and  24  which 
occur  in  the  table.  If  you  do  not  find  nine  different  numbers  by  the  end 
of  the  page,  select  a  new  starting  point  and  continue  until  all  nine  psu's 
have  been  obtained.  Choose  a  new  starting  place  each  time  the  random 
number  table  is  utilized.  Consider  the  following  example. 


Example  1.  Selecting  the  primary  sampling  units. 

Suppose you have a map of a project area from which you need
to select nine primary sampling units. Going to Table A.1, page
36, row 16, columns 14-16, proceed down the columns until you
come to the first number that is between 1 and 180, in this
case, 9. After orienting the grid at 9 degrees, it appears as
in Figure 1.

There are N = 24 grid points lying on the project area. To
select the nine psu's from these points, you again turn to Table
A.1. This time you decide to start in columns 35 and 36, row 10
on page 35. Moving down the columns you find the following
psu's: 4, 17, 12, 22, 6, and 13. Continuing in columns 39 and
40, row 1, you have: 19, 9, and 11.


Figure 1. A grid placed at a 9-degree angle over a map of the project area.


The  order  in  which  these  nine  psu's  are  accessed  in  the  field  is  not 
statistically  important.  Therefore  select  a  route  that  will  minimize  time 
spent  traveling  between  units,  say  4,  6,  9,  11,  12,  13,  17,  19,  and  22. 

In  the  field  each  psu  is  randomly  oriented  with  an  angle  between  1  and 
180  degrees,  with  magnetic  north  at  0  degrees  (DeVries  1974;  Pickford  and 
Hazard  1978).  In  order  to  obtain  the  random  angles  for  the  n  psu's,  go 
to  a  random  number  table  and  choose  any  three  columns  and  any  row.  Select 
the  first  n  numbers  that  are  between  1  and  180  inclusive. 


Example  2.  Selecting  the  angle  associated  with 
each  primary  sampling  unit. 


In Example 1 you have nine psu's and you need to select the
angle associated with each. Going to Table A.1, page 35, and
using columns 9-11 beginning in row 1, you obtain the following
angles: 116, 82, 38, 123, and 177. Continuing in columns 13-15,
beginning in the first row, you get: 54, 32, 86, and 164. Hence
the first psu placed in the field will have an angle of 116 degrees,
the second one will have an angle of 82 degrees, and similarly for
the remaining seven psu's.
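
The table-based selection in Examples 1 and 2 can also be sketched with a pseudorandom number generator. This is only an illustration, not part of the published procedure: the function names, the fixed seed, and the choice of three alternate units are assumptions made for the sketch, and a software generator stands in for Table A.1.

```python
import random


def select_psus(N, n, n_alternates=0, rng=random):
    """Pick n distinct psu's (plus optional alternates) from grid points 1..N."""
    if n + n_alternates > N:
        raise ValueError("not enough grid points for the requested sample")
    chosen = rng.sample(range(1, N + 1), n + n_alternates)  # no repeats
    # Sorting the main sample gives a convenient visiting order; the order
    # in which psu's are accessed is not statistically important.
    return sorted(chosen[:n]), chosen[n:]


def random_angles(count, rng=random):
    """Draw one orientation angle per unit, 1..180 degrees (magnetic north = 0)."""
    return [rng.randint(1, 180) for _ in range(count)]


rng = random.Random(1985)  # hypothetical seed, used only to make the sketch repeatable
grid_angle = rng.randint(1, 180)            # orientation of the systematic grid
psus, alternates = select_psus(N=24, n=9, n_alternates=3, rng=rng)
angles = random_angles(len(psus), rng=rng)  # one angle for each selected psu
print(grid_angle, psus, alternates, angles)
```

Drawing the alternates in the same call keeps them distinct from the main sample, which matches the advice to choose replacement units before field work.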


Because  you  are  interested  in  measuring  soil  erosion  over  the  site, 
such  places  as  rock  outcrops,  which  have  no  appreciable  amounts  of  soil, 
will  not  be  sampled.  That  is,  the  site  is  actually  divided  into  two 
classes — soil  and  other.  You  are  sampling  from  the  first  class,  soil.  If 




you  want  to  give  the  results  in  terms  of  mass  of  eroded  material  from  the 
project  area,  you  will  need  to  know  the  proportion  of  the  project  area 
which  is  soil. 

It is likely that you will encounter obstacles in the field that will
impede the measurements of some of the ssu's associated with a given psu.
If the number of missing ssu's is serious, you may either select a new psu
or increase the total number of psu's in the sample. Both methods require
alternate sampling units selected from the grid overlay. To save
time, select these alternates before field work. Because of the need for
alternate sampling units, we suggest that the grid size be such that N (the
number of grid intersection points on the project area) is at least twice
as large as n.

Equipment 

To  measure  the  changes  in  soil  level,  use  the  erosion  bridge  developed 
by  Ranger  and  Frank  (1978).  (For  your  convenience  we  have  reproduced 
their  paper  and  included  it  in  Appendix  B;  schematic  diagrams  from  the 
paper  have  been  drawn  and  are  included  in  Appendix  C.)  The  erosion 
bridge  is  a  48-inch  or  longer  masonry  level  that  is  placed  on  two  support 
pins.  Once  the  tops  of  the  two  pins  have  been  leveled,  changes  in  the 
pins  can  be  monitored  with  the  horizontal  bubble  of  the  level.  The  pins 
remain  in  the  ground  throughout  the  study.  The  masonry  level  is  carried 
to  and  from  the  field  and  used  on  all  the  support  pins.  Each  primary 
sampling  unit  consists  of  two  erosion  bridges  placed  end-to-end  along  a 
straight  line,  thus  only  three  support  pins  will  be  necessary  as  the 
center  pin  will  be  used  with  both  bridges.  We  suggest  that  a  brightly 
marked  post  be  placed  in  the  ground  near  the  pins  to  make  them  easier  to 
find. 




The  level  has  10  equally  spaced  holes  drilled  in  the  upper  and  lower 
flanges.  A  two-foot  metal  rod  is  placed  through  each  hole  to  the  mineral 
soil  surface,  and  a  measurement  is  taken  of  the  distance  the  rod  protrudes 
above  the  level.  Thus  there  will  be  10  measurements  per  bridge.  You  can 
save  time  in  the  field  and  increase  the  accuracy  of  the  measurements  by 
using  a  metal  rod  scribed  in  tenths  of  inches  or  millimeters. 

Recording  Field  Data 

Figure 2 is a data sheet for field use. Each time field measurements
are taken, the number of each psu as well as the name of the person taking
the measurements should be recorded. For convenience, we suggest
renumbering the n psu's selected from one to n. The original number can
be put in parentheses next to its new number. Any comments about
vegetation, slope, aspect, disturbances, or unusual characteristics of a
particular psu can be written under remarks or on the back of the sheet.

There are 20 ssu's for each psu. These ssu's are recorded under
columns 1 through 20. Always record the ssu's in the same order and in
the same columns each time you take measurements. Determine the amount of
change that has occurred in the soil level for a given ssu by subtracting
the measurements taken at time $t$ from those taken at time $t + \Delta t$.

Figure 3 illustrates this point. Between time $t$ and time $t + \Delta t$, a
total of A inches of soil is deposited at ssu(1). This same amount is
recorded as A', where A' is the difference in the height of the measuring
rod between times $t$ and $t + \Delta t$. Thus, A = A'. Similarly, B inches of
soil are degraded at ssu(2). This amount is recorded as B', and B = B'.
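
As a rough sketch of the subtraction rule just described, the snippet below takes two sets of rod readings for the same ssu's and returns the change at each one. The function name and the use of `None` for a reading that could not be taken are assumptions of the sketch; the sample readings are the first five ssu's of the fourth psu in Tables 1 and 2 of Example 3.

```python
def soil_level_changes(first, second):
    """Return (reading at t + dt) - (reading at t) for each ssu.

    None marks a reading that could not be taken; the change for that
    ssu is then also None. Negative values indicate soil loss.
    """
    if len(first) != len(second):
        raise ValueError("both visits must list the same ssu's in the same order")
    changes = []
    for before, after in zip(first, second):
        if before is None or after is None:
            changes.append(None)
        else:
            changes.append(round(after - before, 2))
    return changes


# First five ssu's of psu 4 (inches); the fifth was obstructed on the second visit.
first_visit = [8.05, 8.40, 8.00, 8.50, 9.00]
second_visit = [8.50, 7.95, 7.25, 7.85, None]
print(soil_level_changes(first_visit, second_visit))
```

Keeping the missing reading in place, rather than dropping it, preserves the column order that the data sheet relies on.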


[Figure 2: blank field data sheet headed "EROSION BRIDGE FIELD DATA," with spaces for study area, technician, date, and measurement units; one row per psu and columns for ssu's 1 through 20 (left to right facing upslope); and a remarks block.]

Figure 2. Field data sheet.






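[Figure 3: cross-section showing the level resting on rebar stakes, with the measuring rod passing through it to the soil. Labels: height at time $t$ and at time $t + \Delta t$ for ssu (1), where the difference is A'; height at time $t$ and at time $t + \Delta t$ for ssu (2), degradation, where the difference is B'; old soil surface (time $t$); new soil surface (time $t + \Delta t$).]

Figure 3. Measuring differences in the soil level over time.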



STATISTICAL  ANALYSIS 

The following analysis is based on the assumption that changes in the
soil elevation are homogeneous over the project area. If the changes are
localized, then the estimator developed in the Definition of Terms section
will be inefficient.1/ If you suspect that the changes in the soil
elevation are not going to be similar, stratify the project area into
areas that are homogeneous with respect to erosion. For a discussion on
stratified sampling see Appendix D.

Definition  of  Terms 

In  order  to  calculate  the  mean  change  in  the  soil  level,  develop  the 
notation  given  below.  Let 

$A$ = total size of the project area (acres);

$P$ = proportion of the area that is soil;

$\rho$ = bulk density of the soil (g/cm$^3$);

$n$ = number of primary sampling units (erosion bridge locations);

$m_i$ = number of ssu's taken at the $i$th psu (20 if no data are missing);

$\Delta t$ = time elapsed between time $t$ and $t + \Delta t$ (usually expressed in years);

$x_{ij}(t)$ = value of the $j$th ssu of the $i$th psu at time $t$;

$x_{ij}(t + \Delta t)$ = value of the $j$th ssu of the $i$th psu at time $t + \Delta t$;

$d_{ij} = x_{ij}(t + \Delta t) - x_{ij}(t) = \Delta x_{ij}(t)$ = change in the soil level during time $\Delta t$ at the $j$th ssu of the $i$th psu;

$d_i = \sum_{j=1}^{m_i} d_{ij}$ = total change in the soil level of the ssu's for the $i$th psu;

$\bar{d}_i = (1/m_i) \sum_{j=1}^{m_i} d_{ij}$ = estimator of the mean change in the level of the $i$th psu during time $\Delta t$;

and

$\bar{\bar{d}} = (1/n) \sum_{i=1}^{n} \bar{d}_i$ = estimator of the mean change in the soil level during time $\Delta t$.

1/ John Hazard. 1981. Station Statistician, Pacific Northwest Forest and Range Experiment Station, USDA Forest Service, Portland, Oregon. (Personal communication).

We make the following assumptions about $d_{ij}$.

1. $d_{ij} = D_i + e_{ij}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m_i$).

2. The value of $D_i$ is the unknown mean of the $i$th psu. The $D_i$'s are
independent and identically distributed random variables from an
infinite population with unknown mean $D$ and unknown variance $\sigma_1^2$.

3. The $e_{ij}$'s are independent and identically distributed random
variables from an infinite population with mean 0 and unknown
variance $\sigma_2^2$.

4. The $e_{ij}$'s and $D_i$'s are pairwise uncorrelated.

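The term $\bar{\bar{d}}$ is an unbiased estimator of $D$, the mean change in the soil
level during time $\Delta t$. The estimator of the variance of $\bar{\bar{d}}$ is

$$S^2(\bar{\bar{d}}) = \frac{\sum_{i=1}^{n} (\bar{d}_i - \bar{\bar{d}})^2}{n (n - 1)} = \frac{\sum_{i=1}^{n} \bar{d}_i^2 - n \bar{\bar{d}}^2}{n (n - 1)} \qquad [1]$$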
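
The estimator of the mean change and the variance in Eq. [1] can be sketched as below. The function name is an assumption of the sketch; the psu totals and counts are the $d_i$ and $m_i$ columns listed in Table 3 of Example 3. Because the sketch carries full precision, its standard error comes out near .0785 rather than the .0787 obtained in the text from the rounded variance .0062.

```python
from math import sqrt


def mean_change(psu_totals, psu_counts):
    """Return (psu means, grand mean, variance of the grand mean).

    psu_totals[i] is the summed change over the ssu's of psu i and
    psu_counts[i] is the number of ssu's actually measured there.
    """
    if len(psu_totals) != len(psu_counts) or len(psu_totals) < 2:
        raise ValueError("need matching lists with at least two psu's")
    psu_means = [total / count for total, count in zip(psu_totals, psu_counts)]
    n = len(psu_means)
    grand_mean = sum(psu_means) / n
    # First form of Eq. [1]: squared deviations of psu means over n(n - 1).
    variance = sum((d - grand_mean) ** 2 for d in psu_means) / (n * (n - 1))
    return psu_means, grand_mean, variance


totals = [-10.20, -5.25, -18.85, -7.45, -5.70, -10.55, -9.30, -7.25, -2.00]
counts = [20, 20, 20, 19, 20, 20, 20, 20, 20]
psu_means, d_bar, var_d = mean_change(totals, counts)
std_err = sqrt(var_d)
print(round(d_bar, 4), round(var_d, 4), round(std_err, 4))
```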




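Multiplying $\bar{\bar{d}}$ by a constant, $c$, yields an unbiased estimator of $c \cdot D$.
The variance of this estimator is

$$S^2(c \cdot \bar{\bar{d}}) = c^2 \cdot S^2(\bar{\bar{d}}).$$

For example, if you want to express the change in the soil level in tons
per acre per year, $\bar{\bar{r}}$, and the measurement units are inches, let
$c = 113.31 \cdot \rho/\Delta t$ where $\Delta t$ is expressed in years. The value 113.31 is
evaluated by

$$113.31 = \frac{(3630\ \text{ft}^3/\text{acre-inch}) \cdot (28{,}338\ \text{cm}^3/\text{ft}^3) \cdot (2.203\ \text{lb/kg})}{(1{,}000\ \text{g/kg}) \cdot (2{,}000\ \text{lb/ton})}$$

This gives us

$$\bar{\bar{r}} = (113.31 \cdot \rho/\Delta t) \cdot \bar{\bar{d}} \qquad [2]$$

and

$$S^2(\bar{\bar{r}}) = (113.31 \cdot \rho/\Delta t)^2 \cdot S^2(\bar{\bar{d}}) \qquad [3]$$

If you want to express your results in terms of tons of eroded material, $M$,
of the project area, you let $c = 113.31 \cdot P \cdot A \cdot \rho$. Thus

$$M = (113.31 \cdot P \cdot A \cdot \rho) \cdot \bar{\bar{d}} \qquad [4]$$

and

$$S^2(M) = (113.31 \cdot P \cdot A \cdot \rho)^2 \cdot S^2(\bar{\bar{d}}) \qquad [5]$$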
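
The conversion constants in Eqs. [2] through [5] can be checked with a short sketch. The function and variable names are assumptions made for illustration. The unit factors are the ones printed in the text, and the inputs (bulk density 1.21 g/cm^3, 0.24 year, 81 percent soil, 64 acres) are those used in Example 3.

```python
# Unit factors exactly as printed in the text.
FT3_PER_ACRE_INCH = 3630.0
CM3_PER_FT3 = 28338.0
LB_PER_KG = 2.203
G_PER_KG = 1000.0
LB_PER_TON = 2000.0

# Tons per acre for one inch of soil at a bulk density of 1 g/cm^3.
derived = FT3_PER_ACRE_INCH * CM3_PER_FT3 * LB_PER_KG / (G_PER_KG * LB_PER_TON)

K = 113.31  # rounded value used throughout the text


def rate_constant(bulk_density, years):
    """Constant c for tons per acre per year (Eqs. [2] and [3])."""
    return K * bulk_density / years


def mass_constant(bulk_density, soil_fraction, acres):
    """Constant c for total tons on the project area (Eqs. [4] and [5])."""
    return K * soil_fraction * acres * bulk_density


def scale(estimate, std_error, c):
    """Multiply an estimate and its standard error by the constant c."""
    return c * estimate, abs(c) * std_error


c_rate = rate_constant(bulk_density=1.21, years=0.24)
c_mass = mass_constant(bulk_density=1.21, soil_fraction=0.81, acres=64)
print(round(derived, 2), round(c_rate, 2), round(c_mass, 2))
print(scale(-0.43, 0.08, c_rate))
```

The standard error scales by the absolute value of the constant, which is the square root of the $c^2$ factor in Eqs. [3] and [5].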






Example  3.  Statistical  computations  with  some  erosion  bridge  data. 

Suppose you placed the nine psu's from Example 2 in the
field and took two sets of measurements, one on February 17,
1981, and one on May 16, 1981. These are recorded in Tables 1
and 2, respectively. Hence you have a time interval of 88 days
(0.24 yr.) between measurements. Table 3 contains the
differences of the first set of measurements subtracted from the
second, by ssu for each psu. For example, ssu (1) of the
first psu showed a change in the soil level of

7.85 inches - 8.50 inches = -.65 inches.2/

Similarly, the fifth ssu of the third psu indicated a loss of

4.30 inches - 6.05 inches = -1.75 inches.2/

Notice that in Table 2 a measurement has not been recorded
for the fifth ssu of the fourth psu. Under "Remarks" you find
the reason for this: new vegetation now obstructs the pathway
of the pin to the ground. Thus Table 3 also has a missing data
point for the same ssu and psu. This means that $m_4$, the number
of measurements recorded for this psu, is 19. This information
is recorded in Table 3 under the column "$m_i$."


2/ A negative number means soil loss. Please keep in mind that
measurements are from the top of the level to the top of the measuring
rod. Therefore, soil loss will result in second-measurement numbers that
are smaller than first-measurement numbers.


EROSION BRIDGE FIELD DATA

Study Area: Example 3
Technician: ________
Date: February 17, 1981
Measurement units: Inches

SSU's (left to right facing upslope)

PSU           1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19    20
1 (4) 3/   8.50  8.45  8.50  8.30  9.40  9.60  9.65  8.85  9.85  9.60  8.60  8.70  8.60  9.65  9.25  9.75  8.50  8.60  8.35  8.35
2 (6)      7.90  6.75  7.55  8.00  8.50  6.75  9.40  9.95 10.10 10.50 10.45 10.45  8.35  9.40  9.35  9.45 10.10  9.35  7.25  7.15
3 (9)      8.00  7.80  7.60  6.80  6.05  6.60  6.75  5.70  6.95  7.00  7.70  7.85  6.90  6.85  6.60  6.75  7.15  7.45  5.85  6.40
4 (11)     8.05  8.40  8.00  8.50  9.00  8.70  9.05  9.10  8.50  8.20  8.50  9.30  8.90  8.20  8.10  8.90  8.15  8.75  9.00  9.10
5 (12)     8.45  8.95  9.25  8.00  8.40  8.45  8.75  7.95  9.40  9.15  9.30  8.75  8.10  9.15  8.30  8.70  8.25  8.25  8.95  8.00
6 (13)     4.50  4.35  3.95  5.55  7.85  8.30  8.75  9.00  8.00  6.60  7.45  7.50  6.25  5.25  8.25  6.40  5.00  5.95  6.50  8.30
7 (17)     9.00  7.70  7.75  7.50  7.50  8.15  8.40  8.95  9.10  8.55  8.60  8.40  7.75  8.35  8.25  8.65  8.65  8.75  9.00  8.50
8 (19)     8.60  8.30  8.10  7.40  7.10  7.00  6.45  7.10  6.10  6.50  8.00  8.35  7.35  7.50  7.25  6.35  8.25  8.45  7.90  8.30
9 (22)     6.00  6.20  6.60  6.65  6.70  6.70  6.90  6.65  7.50  7.55  7.35  6.85  6.95  6.50  6.75  6.50  6.30  6.10  6.05  6.30

Remarks:

3/ The numbers in parentheses are the original psu numbers.

Table 1. Measurements at time $t$.


EROSION BRIDGE FIELD DATA

Study Area: Example 3
Technician: ________
Date: May 16, 1981
Measurement units: Inches

SSU's (left to right facing upslope)

PSU           1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19    20
1 (4) 3/   7.85  7.60  7.65  8.10  9.00  8.85  8.50  8.70  9.45  9.50  8.50  8.50  7.80  8.60  8.45  8.75  7.70  8.20  8.95  8.20
2 (6)      7.60  6.70  7.40  7.25  7.75  8.60  9.15  9.20 10.10 10.50  9.85  9.80  9.45  8.85  9.20  8.45  7.95 10.25  7.55  7.85
3 (9)      7.90  7.20  5.45  4.75  4.30  3.70  5.80  5.80  6.45  6.85  4.70  5.95  6.10  5.25  6.90  7.40  5.60  4.75  7.85  7.20
4 (11)     8.50  7.95  7.25  7.85    **  8.30  8.40  9.10  8.50  8.20  8.50  7.40  8.20  8.95  8.45  8.25  8.95  8.05  7.70  7.45
5 (12)     8.45  8.90  8.00  7.10  7.95  7.80  8.35  7.70  9.05  9.25  8.70  8.85  8.35  8.55  8.80  8.55  7.75  7.85  8.85  8.00
6 (13)     4.60  4.25  4.15  5.30  6.80  8.20  8.20  8.50  6.05  5.40  5.10  4.30  5.25  7.00  5.35  6.50  8.30  5.55  5.85  8.50
7 (17)     8.45  8.00  7.70  6.85  7.65  7.90  8.35  8.60  9.00  8.55  6.95  7.25  7.25  6.95  8.25  7.90  8.20  7.15  9.95  7.30
8 (19)     7.80  8.20  8.10  7.60  7.20  7.10  6.40  7.25  6.50  6.20  8.00  7.75  6.65  7.00  6.45  6.70  6.25  7.15  7.20  7.50
9 (22)     6.00  5.85  6.25  6.60  6.10  6.00  6.35  6.20  6.85  7.50  6.30  6.20  6.55  7.20  7.35  7.30  7.50  6.10  6.15  6.75

** No measurement recorded.

Remarks: New vegetation inhibited a measurement of ssu(5) of the fourth psu.

Table 2. Measurements at time $t + \Delta t$.




EROSION BRIDGE DATA SHEET

Study Area: Example 3
From: February 17, 1981 to May 16, 1981
Total Time: 0.24 Year
Measurement Units: Inches

SSU's (left to right facing upslope); the last four columns are $m_i$, $d_i$, $\bar{d}_i$, and $\bar{d}_i^2$.

PSU           1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19    20  m_i     d_i  mean d_i  (mean d_i)^2
1 (4) 3/   -.65  -.85  -.85  -.20  -.40  -.75 -1.15  -.15  -.40  -.10  -.10  -.20  -.80 -1.05  -.80 -1.00  -.80  -.40   .60  -.15   20  -10.20    -.5100         .2601
2 (6)      -.30  -.05  -.15  -.75  -.75  -.15  -.25  -.75   .00   .00  -.60  -.65  1.10  -.55  -.15 -1.00 -2.15   .90   .30   .70   20   -5.25    -.2625         .0689
3 (9)      -.10  -.60 -2.15 -2.05 -1.75 -2.90  -.95   .10  -.50  -.15 -3.00 -1.90  -.80 -1.60   .30   .65 -1.55 -2.70  2.00   .80   20  -18.85    -.9425         .8883
4 (11)      .45  -.45  -.75  -.65    **  -.40  -.65   .00   .00   .00   .00 -1.90  -.70   .75   .35  -.65   .80  -.70 -1.30 -1.65   19   -7.45    -.3921         .1537
5 (12)      .00  -.05 -1.25  -.90  -.45  -.65  -.40  -.25  -.35   .10  -.60   .10   .25  -.60   .50  -.15  -.50  -.40  -.10   .00   20   -5.70    -.2850         .0812
6 (13)      .10  -.10   .20  -.25 -1.05  -.10  -.55  -.50 -1.95 -1.20 -2.35 -3.20 -1.00  1.75 -2.90   .10  3.30  -.40  -.65   .20   20  -10.55    -.5275         .2783
7 (17)     -.55   .30  -.05  -.65   .15  -.25  -.05  -.35  -.10   .00 -1.65 -1.15  -.50 -1.40   .00  -.75  -.45 -1.60   .95 -1.20   20   -9.30    -.4650         .2162
8 (19)     -.70  -.10   .00   .20   .10   .10  -.05   .15   .40  -.30   .00  -.60  -.70  -.50  -.80   .35 -2.00 -1.30  -.70  -.80   20   -7.25    -.3625         .1314
9 (22)      .00  -.35  -.35  -.05  -.60  -.70  -.55  -.45  -.65  -.05 -1.05  -.65  -.40   .70   .60   .80  1.20   .00   .10   .45   20   -2.00    -.1000         .0100

Total                                                                                                                                                  -3.8471        2.0881

$A$ = 64 acres     $P$ = .81     $\rho$ = 1.21

$\bar{\bar{d}}$ = -.43     $S(\bar{\bar{d}})$ = .08

$\bar{\bar{r}}$ = -245.65     $S(\bar{\bar{r}})$ = 45.70

$M$ = -3056.24     $S(M)$ = 568.60

** No measurement recorded.

3/ The numbers in parentheses are the original psu numbers.

Table 3. Differences in the soil level between time $t$ and time $t + \Delta t$.




Table 3 contains a summary of the details necessary to
compute the average amount of soil eroded, $\bar{\bar{d}}$, and its variance,
$S^2(\bar{\bar{d}})$. The sample values of $d_i$ and $\bar{d}_i$ are recorded under the
appropriate columns in Table 3, as well as $\bar{d}_i^2$.

For example,

$$d_1 = [(-.65) + (-.85) + \cdots + (.60) + (-.15)] = -10.20,$$

$$\bar{d}_1 = (-10.20)/20 = -.5100,$$

and

$$\bar{d}_1^2 = (-.5100)^2 = .2601.$$

Similarly,

$$d_4 = [(.45) + (-.45) + \cdots + (-1.30) + (-1.65)] = -7.45,$$

$$\bar{d}_4 = (-7.45)/19 = -.3921,$$

and

$$\bar{d}_4^2 = (-.3921)^2 = .1537.$$

The totals of the two columns $\bar{d}_i$ and $\bar{d}_i^2$ are

$$[(-.5100) + (-.2625) + \cdots + (-.3625) + (-.1000)] = -3.8471$$

and

$$[(.2601) + (.0689) + \cdots + (.1314) + (.0100)] = 2.0881,$$

respectively.

Thus,

$$\bar{\bar{d}} = \Big(\sum_{i=1}^{n} \bar{d}_i\Big)/n = (-3.8471)/9 = -.4275 \approx -.43$$

and

$$S^2(\bar{\bar{d}}) = \frac{\sum_{i=1}^{n} \bar{d}_i^2 - n \bar{\bar{d}}^2}{n (n - 1)} = \frac{2.0881 - 9 \cdot (-.4275)^2}{9 \cdot 8} = 0.0062$$

with

$$S(\bar{\bar{d}}) = .0787 \approx .08.$$

To compute $\bar{\bar{r}}$ you multiply $\bar{\bar{d}}$ by the constant

$$c = 113.31 \cdot 1.21/(.24) = 571.27.$$

This gives

$$\bar{\bar{r}} = 571.27 \cdot (-.43) = -245.65 \text{ tons/acre/year}$$

and

$$S(\bar{\bar{r}}) = 571.27 \cdot (.08) = 45.70 \text{ tons/acre/year}.$$

Similarly, you compute $M$ by multiplying $\bar{\bar{d}}$ by the constant

$$c = 113.31 \cdot (.81) \cdot 64 \cdot 1.21 = 7107.53$$

giving us

$$M = 7107.53 \cdot (-.43) = -3056.24 \text{ tons}$$

and

$$S(M) = 7107.53 \cdot (.08) = 568.60 \text{ tons}.$$

Confidence  Intervals 

As  stated  earlier,  a  project  area  will  belong  in  one  of  two 
categories: 

1.  A  site  such  as  a  timber  sale  on  which  management  activity  has 
taken  place. 

2.  A  site  on  which  management  activity  will  take  place. 
Let  us  consider  each  case  separately  and  develop  the  appropriate 
confidence  intervals. 




Category 1. Suppose $\bar{\bar{r}}$ as developed in the previous section is an unbiased
estimator of the true population parameter, $R$. A $1 - \alpha$ confidence interval
on $R$ is given by $(L, U)$ where

$$L = \bar{\bar{r}} - S(\bar{\bar{r}}) \cdot t(1 - \alpha/2;\ n - 1) \qquad [6]$$

and

$$U = \bar{\bar{r}} + S(\bar{\bar{r}}) \cdot t(1 - \alpha/2;\ n - 1). \qquad [7]$$

The term $t(1 - \alpha/2;\ n - 1)$ is the $(1 - \alpha/2) \cdot 100$ percent quantile of
the t-distribution with $n - 1$ degrees of freedom. That is, if you are
given a t-distribution with $n - 1$ degrees of freedom, then the probability
of obtaining a value less than $t(1 - \alpha/2;\ n - 1)$ is $1 - \alpha/2$. The values
of $t(1 - \alpha/2;\ n - 1)$ may be taken from Table A.2 in Appendix A. Example 4
illustrates how we use the table.


Example 4. Finding a value in the t-tables.

Suppose you want the value of t(.90, 8). Turn to Table A.2
and go down the column "f" until you come to an 8. Go across
this row until you intersect the column "0.90." The number at
which row "8" and column "0.90" intersect is the value of
t(.90, 8). It is 1.3968.



The confidence interval given above either will contain $R$ or it will
not. However, if you could repeat the process used to estimate $R$ with $\bar{\bar{r}}$
over and over ad infinitum, then $R$ would belong to $(1 - \alpha) \cdot 100$ percent
of the resulting confidence intervals.

Suppose that you are interested in how $R$ compares to a standard or any
given constant, call it $K$. You can use the confidence interval to decide
whether or not $R$ is statistically different from $K$ at the $\alpha \cdot 100$ percent
significance level. If $K$ is such that $L < K < U$, then we say that $R$ is
not statistically different from $K$. Consider this example.


Example 5. Putting a confidence interval on the erosion rate.

Suppose you want to put an 80 percent confidence interval
on $R$ for the project area in Example 3. Go to Table A.2 and look
up the value of t(.90, 8) (because $\alpha/2 = .10$ and $n - 1 = 8$). It
is 1.3968. Then compute

L = -245.65 - (45.70) * (1.3968) = -309.48

and

U = -245.65 + (45.70) * (1.3968) = -181.82

This confidence interval, (-309.48, -181.82), does not include 0,
and both values are negative. Thus you conclude that the project
area is losing soil.
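
The interval in Example 5 can be reproduced with a few lines of code. This is a minimal sketch: the function names are assumptions, and the t value is passed in by hand after being read from the t-table rather than computed.

```python
def confidence_interval(estimate, std_error, t_value):
    """Return (L, U) = estimate -/+ std_error * t_value, as in Eqs. [6] and [7]."""
    half_width = std_error * t_value
    return estimate - half_width, estimate + half_width


def differs_from(constant, interval):
    """True when the constant K lies outside the open interval (L, U)."""
    lower, upper = interval
    return not (lower < constant < upper)


# Example 5 inputs: erosion rate, its standard error, and t(.90, 8).
interval = confidence_interval(-245.65, 45.70, 1.3968)
print(tuple(round(v, 2) for v in interval))
print(differs_from(0, interval))
```

Passing the tabulated t value as an argument keeps the sketch free of any statistics library and mirrors the table lookup described in Example 4.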



Category 2. In this case, take two separate samples on the project
area. One is taken before a given management practice and the other after
practice implementation. With each sample you estimate the mean rate of
change of the soil level in terms of tons per acre per year according to
the procedures developed in the previous section. Denote the first estimate
by $\bar{\bar{r}}_1$ and the second by $\bar{\bar{r}}_2$; they are estimates of $R_1$ and $R_2$, respectively.
It is assumed that $\bar{\bar{r}}_1$ and $\bar{\bar{r}}_2$ have a common variance, $\sigma^2$, such that
$\sigma^2(\bar{\bar{r}}_1) = \sigma^2/n_1$ and $\sigma^2(\bar{\bar{r}}_2) = \sigma^2/n_2$. An unbiased estimator of $\sigma^2$ is given
by

$$S_p^2 = \frac{n_1 (n_1 - 1) \cdot S^2(\bar{\bar{r}}_1) + n_2 (n_2 - 1) \cdot S^2(\bar{\bar{r}}_2)}{n_1 + n_2 - 2} \qquad [8]$$

where $S^2(\bar{\bar{r}}_1)$ and $S^2(\bar{\bar{r}}_2)$ are calculated according to Eq. [3].

Denoting the difference between $R_2$ and $R_1$ by $D$ where $D = R_2 - R_1$, its
estimator is given by $\hat{D} = \bar{\bar{r}}_2 - \bar{\bar{r}}_1$. The estimator of the variance of $\hat{D}$ is

$$S^2(\hat{D}) = S_p^2 \cdot \left(\frac{1}{n_1} + \frac{1}{n_2}\right) \qquad [9]$$

To put a $(1 - \alpha)$ confidence interval, $(L, U)$, on $D$ we calculate $L$ by

$$L = \hat{D} - S(\hat{D}) \cdot t(1 - \alpha/2;\ n_1 + n_2 - 2)$$

and $U$ by

$$U = \hat{D} + S(\hat{D}) \cdot t(1 - \alpha/2;\ n_1 + n_2 - 2).$$



If  the  confidence  interval  contains  0,  then  you  conclude  that  the  erosion 
rates  are  not  statistically  different.  Example  6  illustrates  this  concept. 


Example 6. Putting a confidence interval on the difference between
two erosion rates.

Suppose before you drew the sample given in Example 3
another sample of size nine had been drawn on the same project
area. In that sample, $\bar{\bar{r}}$ was evaluated to be -154.24 with a
standard error of 47.21. You want to compare the erosion rates
from the two samples. Hence you have

$\bar{\bar{r}}_1 = -154.24$,     $S^2(\bar{\bar{r}}_1) = (47.21)^2 = 2228.78$,

$\bar{\bar{r}}_2 = -245.65$,     $S^2(\bar{\bar{r}}_2) = (45.70)^2 = 2088.49$,

$\hat{D} = -91.41$, and $n_1 = n_2 = 9$.

Calculating $S_p^2$ you get

$$S_p^2 = \frac{9 \cdot 8 \cdot (47.21)^2 + 9 \cdot 8 \cdot (45.70)^2}{9 + 9 - 2} = 19{,}427.73.$$

Estimating the variance of $\hat{D}$ you get

$$S^2(\hat{D}) = 19{,}427.73 \cdot \left(\frac{1}{9} + \frac{1}{9}\right) = 4{,}317.27$$

with standard error of $S(\hat{D}) = 65.71$. The values of L and U for
an 80 percent confidence interval are given by

L = -91.41 - (65.71) * (1.3368) = -179.25

and

U = -91.41 + (65.71) * (1.3368) = -3.57.

Because the confidence interval does not contain 0, you conclude
that the two rates are different.
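
The arithmetic of Example 6 can be sketched as follows. The function names are assumptions of the sketch, and t(.90, 16) = 1.3368 is entered by hand from the t-table. Because the sketch does not round intermediate results, its limits agree with the printed -179.25 and -3.57 only to within about .01.

```python
from math import sqrt


def pooled_variance(var1, n1, var2, n2):
    """Pooled estimator of the common variance, Eq. [8]."""
    return (n1 * (n1 - 1) * var1 + n2 * (n2 - 1) * var2) / (n1 + n2 - 2)


def difference_interval(r1, se1, n1, r2, se2, n2, t_value):
    """Return (difference, its standard error, (L, U)) for r2 - r1."""
    sp2 = pooled_variance(se1 ** 2, n1, se2 ** 2, n2)
    diff = r2 - r1
    se_diff = sqrt(sp2 * (1 / n1 + 1 / n2))  # square root of Eq. [9]
    half_width = se_diff * t_value
    return diff, se_diff, (diff - half_width, diff + half_width)


# Example 6 inputs: rates and standard errors of the two samples of nine psu's.
diff, se_diff, (lower, upper) = difference_interval(
    r1=-154.24, se1=47.21, n1=9,
    r2=-245.65, se2=45.70, n2=9,
    t_value=1.3368,
)
print(round(diff, 2), round(se_diff, 2), round(lower, 2), round(upper, 2))
```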


When using confidence intervals to make decisions as you did in
Categories 1 and 2, you can make two kinds of mistakes. The first one,
called a Type I error, is this:

Given that $R$ and $K$ (or $R_1$ and $R_2$) are equal, you obtained an unusual
sample causing you to conclude that they are different. If you could
conduct the sampling over and over under the same circumstances you
would expect this mistake to occur about $\alpha \cdot 100$ percent of the time.

The second error is called a Type II error. It is as follows:

Given that $R$ and $K$ (or $R_1$ and $R_2$) are not the same, you conclude that
they are. You would incorrectly come to this conclusion about
$\beta \cdot 100$ percent of the time if you could sample over and over under
the same conditions.




The terms $\alpha$ and $\beta$ are referred to as the probabilities of making a Type I
error and a Type II error, respectively.

The probabilities of making the two types of errors, $\alpha$ and $\beta$, are
inversely related. Assuming that you are working with a fixed sample
size, an increase in $\alpha$ is accompanied by a corresponding decrease in $\beta$.
Similarly, a decrease in $\alpha$ causes an increase in $\beta$. Thus you need to
consider the consequences of making a wrong decision when you choose $\alpha$
and $\beta$. If it is important to detect a difference when one exists (for
example, to avoid contaminating a public water supply), you should make $\beta$
small. On the other hand, if it is important that you not say a
difference exists when it does not (for example, when it is used as
evidence to shut down a logging operator), make $\alpha$ small (Ponce 1980).

Determining  the  Sample  Size 

One of the first questions in a study is "How large a sample should
I take?" You do not want to take a sample that is so small that the
resulting estimate is too inaccurate to be of use. On the other hand,
you do not want to take a sample that is so large that it is a waste of
time and money (Snedecor and Cochran 1967, p. 516). There is not any one
answer that we can give to the above question as the sample size depends
on:

1. The probability of making a Type I error, $\alpha$.

2. The probability of making a Type II error, $\beta$.

3. The size of the difference we want to be able to detect, $\delta$.

4. A prior estimate of the variance $S^2(\bar{\bar{r}})$.



You set the levels for $\alpha$ and $\beta$ based on your professional
knowledge and consideration of the seriousness of making either error.
You choose a value for $\delta$ also based on your professional
knowledge; and you may assign a value of $S^2(\bar{\bar{r}})$ using data
from a similar site or from a previous sample on the same site. Assuming
that $S^2(\bar{\bar{r}})$ was calculated from a sample with n psu's, we
define a new term, $S^2$, as $S^2 = n \times S^2(\bar{\bar{r}})$. The
terms $\delta$ and $S = \sqrt{S^2}$ should both be given in the same
units.

You can approximate the sample size necessary, $n_t$ 4/, by

$$n_t \ge \frac{S^2 \times (t_0 + t_1)^2}{\delta^2} \qquad [10]$$


The value of $t_0$ is the t value associated with making a Type I error;
and $t_1$ is the t value associated with making a Type II error. The
value of $t_0$ is the tabulated t value for probability $1 - \alpha/2$;
and the value of $t_1$ is the tabulated t value for probability
$1 - \beta$. You use successive approximations to $n_t$ because the
degrees of freedom associated with $t_0$ and $t_1$ depend upon $n_t$.

For further discussion on the choice of the sample size see Sokal and
Rohlf (1969, p. 246-249) and Steel and Torrie (1960, p. 84-86, 154-156),
much of which was the basis for this section. Equation [10] is a
modification of their results. Because we are assuming that you want to
detect a difference between a mean and a constant, K, we use $S^2$. They
are assuming that you want to detect the difference between two means
from populations with the same variance, $\sigma^2$, of which $S^2$ is an
estimator. In their case $2 \times S^2$ should replace $S^2$ in Eq. [10].

4/ We make use of the fact that $n_t$ is an integer and use the next
integer greater than or equal to $n_t$ in Eq. [10].
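Written side by side, and using the same symbols as Eq. [10], the two cases differ only in the factor of 2 on the variance term. This is a restatement for comparison, not a new result:

```latex
\begin{align*}
\text{mean versus a constant } K:\quad
  & n_t \ge \frac{S^2 \,(t_0 + t_1)^2}{\delta^2} \\[4pt]
\text{two means, common variance } \sigma^2:\quad
  & n_t \ge \frac{2\,S^2 \,(t_0 + t_1)^2}{\delta^2}
\end{align*}
```

For the same error probabilities and the same detectable difference, comparing two means therefore requires roughly twice the sample size in each group.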

The results from Example 3 are used in the following example to
illustrate how the sample size needed for future studies is approximated.


Example 7. Determining the sample size using the sample variance
from a previous study.

Suppose we have decided on the following values for $\alpha$, $\beta$,
and $\delta$, respectively: .20, .25, and 70. Also assume that from
previous results you have a value of $S^2(\bar{\bar{r}}) = (45.70)^2$
based on a sample size of 9 psu's; thus $S^2 = 18{,}796.41$. For your
first approximation of $n_t$, you use $t_0 = t(.90, 8)$ and
$t_1 = t(.75, 8)$. Estimating $n_t$ you have

$$n_t = \frac{18{,}796.41 \times (1.3968 + .7064)^2}{(70)^2} = 16.97 \approx 17$$

For your second approximation you use $t_0 = t(.90, 17)$ and
$t_1 = t(.75, 17)$. This gives

$$n_t = \frac{18{,}796.41 \times (1.3334 + .6892)^2}{(70)^2} = 15.69 \approx 16$$

Your third approximation uses $t_0 = t(.90, 16)$ and $t_1 = t(.75, 16)$
giving

$$n_t = \frac{18{,}796.41 \times (1.3368 + .6901)^2}{(70)^2} = 15.76 \approx 16$$

Since your second and third approximations give the same values
of $n_t$, you stop and use $n_t = 16$.
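The successive approximations in Example 7 can be sketched as a short loop. This is a minimal illustration under stated assumptions: the function name `approximate_sample_size` and the dictionary `T_TABLE` are hypothetical, the t values are copied from Table A.2 for only the degrees of freedom that this example needs, and the degrees of freedom at each step follow the choices made in the example (8 for the first pass, then the latest value of n_t).

```python
import math

# t values copied from Table A.2, keyed by (probability, degrees of freedom).
# Only the entries needed for Example 7 are included in this sketch.
T_TABLE = {
    (0.90, 8): 1.3968, (0.75, 8): 0.7064,
    (0.90, 16): 1.3368, (0.75, 16): 0.6901,
    (0.90, 17): 1.3334, (0.75, 17): 0.6892,
}


def approximate_sample_size(s2, delta, alpha, beta, first_df, t_table, max_iter=20):
    """Apply Eq. [10] repeatedly until two successive values of n_t agree."""
    p0 = round(1 - alpha / 2, 4)  # probability used to look up t0
    p1 = round(1 - beta, 4)       # probability used to look up t1
    df, previous = first_df, None
    for _ in range(max_iter):
        t0, t1 = t_table[(p0, df)], t_table[(p1, df)]
        # Round up to the next integer, as in footnote 4/
        n_t = math.ceil(s2 * (t0 + t1) ** 2 / delta ** 2)
        if n_t == previous:
            return n_t
        previous, df = n_t, n_t  # Example 7 uses the latest n_t as the next df
    raise RuntimeError("successive approximations did not settle")


s2 = 9 * 45.70 ** 2  # S^2 = n * S^2(r), with n = 9 psu's
n_t = approximate_sample_size(s2, delta=70, alpha=0.20, beta=0.25,
                              first_df=8, t_table=T_TABLE)
print(n_t)  # 16, matching Example 7
```

For other inputs the lookup table would need the t values for every degrees-of-freedom value the loop visits; a missing entry raises a `KeyError` rather than guessing.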




COMMENTS 

The purpose of this document is to describe a procedure for measuring
soil erosion on a given site. The specific reasons for measuring soil
erosion  are  the  responsibility  of  the  person  using  this  tool.  You  must 
think  about  cause  and  effect  relationships  between  management  practices 
and  soil  erosion  as  well  as  soil  erosion  effects  on  productivity  and 
sediment.  If  data  are  collected  for  variables  in  a  soil  erosion  model  at 
the  same  time  erosion  is  being  measured,  you  can  compare  the  predictive 
capabilities  of  the  model  with  actual  results.  In  a  similar  way,  data 
about  vegetative  productivity  and/or  water  quality  could  be  used  to 
evaluate  soil  loss  tolerance  values  or  to  establish  relationships  between 
soil  erosion  and  vegetative  productivity  for  selected  soils  and 
management  practices. 

Although  this  method  of  measuring  soil  erosion  is  simple,  several 
factors  can  affect  your  results.  First,  the  elevation  of  a  stake  may 
change  between  measurements  because  it  was  struck  or  the  soil  shrank 
and/or swelled during freeze-thaw or wet-dry cycles. If one stake of a
pair has moved more than the other, the erosion bridge will not be level,
which indicates that the stakes were disturbed. In such cases, you should
relevel  the  stakes  and  use  the  new  measurements  in  future  comparisons. 

Second,  there  can  be  changes  in  the  actual  soil  surface  elevation  due 
to  soil  bulk  density  changes.  An  increase  or  decrease  in  bulk  density 
between  successive  measurements  may  affect  your  interpretation  of  the 
data, causing you to incorrectly conclude that either soil loss or
deposition has occurred because of changes in the soil surface
elevation.  Following  are  some  examples  of  the  effect  of  short-term  and 
long-term  soil  bulk  density  changes  on  soil  surface  elevation. 

Soil  shrink-swell  cycles  will  be  considerably  shorter  than  the  total 
time  period  used  for  erosion  measurements.  Therefore,  when  you  take 
measurements  several  times  during  site  recovery,  the  shrink-swell  effects 
on  measurements  are  expected  to  be  self-compensating  around  a  mean  soil 
elevation.  On  the  other  hand,  erosion  is  expected  to  show  a  cumulative 
effect over time. Soil disturbances may cause a temporary reduction in
bulk density, causing you to overestimate soil loss if your initial
measurements are made before the soil reconsolidates following a site
disturbance by management practices.
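The difference between a self-compensating cycle and a cumulative trend can be sketched with made-up numbers. Everything in the block below is hypothetical and chosen only for illustration: a steady loss of 0.5 mm per month, a 12-month shrink-swell cycle of plus or minus 3 mm, and 48 monthly readings. The helper `slope` is an ordinary least-squares slope written out by hand.

```python
import math


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical soil surface elevations (mm) relative to the first reading:
# a cumulative erosion trend plus a cyclic shrink-swell component.
months = list(range(48))
elevation = [-0.5 * m + 3.0 * math.sin(2 * math.pi * m / 12) for m in months]

# Two readings three months apart can suggest deposition ...
short_term_change = elevation[3] - elevation[0]

# ... while the trend over many readings stays close to the assumed loss rate.
trend = slope(months, elevation)
print(round(short_term_change, 2), round(trend, 2))
```

In this sketch a single pair of readings taken at an unlucky point in the cycle shows an apparent gain, whereas the slope fitted to all readings remains near the assumed erosion rate, which is the reason for measuring several times during site recovery.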

Heavy  animal  use  can  cause  variations  in  the  soil  bulk  density  over 
an  area.  For  areas  with  continual  animal  use,  a  preliminary  study  may 
indicate  the  need  for  a  large  number  of  sampling  locations  to  compensate 
for  the  variability  caused  by  the  animal  use.  In  some  cases  the  number 
of  sampling  locations  needed  to  compensate  for  this  variability  may  make 
it  impractical  to  use  the  erosion  bridge. 

Organic  matter  content  is  another  factor  affecting  soil  bulk  density 
(Adams 1973). On sites where there is extensive decomposition, large
variations in the bulk density are likely. Using an erosion bridge under
these conditions is not recommended.

Finally, to make the most effective use of this procedure, it should
be part of a formal monitoring plan. Simply put, the purpose of any
study is to provide information about a problem or question. Many
studies have been initiated without clear objectives at the outset. The
likelihood of a successful study in such cases is very low. Preparation
of a detailed plan will help define the monitoring objectives and
facilitate a logical approach to data collection. A carefully prepared
plan describes the problem, the kind of information required, how the
data will be collected and analyzed, and when the project and reports are
to be completed. Furthermore, a monitoring plan should provide continuity
even if personnel changes occur.




REFERENCES 


Adams,  W.  A.  1973.  The  effects  of  organic  matter  on  the  bulk  and  true 
densities  of  some  uncultivated  podzolic  soils.  Journal  of  Soil 
Science  24:10-17. 

DeVries, P. G. 1973. A general theory on line intersect sampling with
application to logging residue inventory. Mededelingen
Landbouwhogeschool No. 73-11. Wageningen, Netherlands. 23 p.

Hill, Sir Austin Bradford. 1962. Statistical methods in clinical and
preventive medicine. E. & S. Livingstone Ltd. London. 610 p.

Howes, Steve, John Hazard, and J. Michael Geist. 1981. Interim
guidelines for sampling soil resource conditions. Publication No.
R6-WM-066-1981. Pacific Northwest Region, USDA Forest Service,
Portland, Oregon. 20 p.

Megahan,  W.  F.  1974.  Erosion  over  time  on  severely  disturbed  granitic 
soils:  A  model.  USDA  Forest  Service  Research  Paper  INT-156. 
Intermountain  Forest  and  Range  Experiment  Station,  USDA  Forest 
Service,  Ogden,  Utah.  14  p. 

Megahan,  Walter  F.,  and  Walter  J.  Kidd.  1972.  Effect  of  logging  roads 
on  sediment  production  rates  in  the  Idaho  Batholith.  USDA  Forest 
Service  Research  Paper  INT-123.  Intermountain  Forest  and  Range 
Experiment  Station,  USDA  Forest  Service,  Ogden,  Utah.  14  p. 

Pickford,  Stewart  G.,  and  John  W.  Hazard.  1978.  Simulation  studies  on 
line intersect sampling of forest residue. Forest Science 24:469-483.

Ponce,  S.  L.  1980.  Statistical  methods  commonly  used  in  water  quality 
data  analysis.  WSDG  Technical  Paper  WSDG-TP-00001.  Watershed 
Systems  Development  Group,  USDA  Forest  Service,  Fort  Collins, 
Colorado.  136  p. 

Ranger, Gerald E., and Franklin F. Frank. 1978. The 3-F erosion
bridge: A new tool for measuring soil erosion. Publication No. 23.
Department of Forestry, State of California, Sacramento, California.
7 p.

Sokal, Robert R., and F. James Rohlf. 1969. Biometry. W.H.
Freeman and Company. San Francisco. 776 p.

Snedecor, George W., and William G. Cochran. 1967. Statistical
methods. 6th ed. Iowa State University Press. Ames, Iowa. 593 p.

Steel, Robert G. D., and James H. Torrie. 1960. Principles and
procedures of statistics with special reference to the biological
sciences. McGraw-Hill Book Company, Inc. New York. 481 p.




APPENDIX  A:  Some  Statistical  Tables 




Table A.1. Ten thousand random digits 5/
Note: There are 40 columns and 50 rows on each page.

1368 9621 9151 2066 1208 2664 9822 6599 6911 5112
5953 5936 2541 4011 0408 3593 3679 1378 5936 2651
7226 9466 9553 7671 8599 2119 5337 5953 6355 6889
8883 3454 6773 8207 5576 6386 7487 0190 0867 1298
7022 5281 1168 4099 8069 8721 8353 9952 8006 9045
4576 1853 7884 2451 3488 1286 4842 7719 5795 3953
8715 1416 7028 4616 3470 9938 5703 0196 3465 0034
4011 0408 2224 7626 0643 1149 8834 6429 8691 0143
1400 3694 4482 3608 1238 8221 5129 6105 5314 8385
6370 1884 0820 4854 9161 6509 7123 4070 6759 6113
4522 5749 8084 3932 7678 3549 0051 6761 6952 7041
7195 6234 6426 7148 9945 0358 3242 0519 6550 1327
0054 0810 2937 2040 2299 4198 0846 3937 3986 1019
5166 5433 0381 9686 5670 5129 2103 1125 3404 8785
1247 3793 7415 7819 1783 0506 4878 7673 9840 6629
8529 7842 7203 1844 8619 7404 4215 9969 6948 5643
8973 3440 4366 9242 2151 0244 0922 5887 4883 1177
9307 2959 5904 9012 4951 3695 4529 7197 7179 3239
2923 4276 9467 9868 2257 1925 3382 7244 1781 8037
6372 2808 1238 8098 5509 4617 4099 6705 2386 2830
6922 1807 4900 5306 0411 1828 8634 2331 7247 3230
9862 8336 6453 0545 6127 2741 5967 8447 3017 5709
3371 1530 5104 3076 5506 3101 4143 5845 2095 6127
6712 9402 9588 7019 9248 9192 4223 6555 7947 2474
3071 8782 7157 5941 8830 8563 2252 8109 5880 9912
4022 9734 7852 9096 0051 7387 7056 9331 1317 7833
9682 8892 3577 0326 5306 0050 8517 4376 0788 5443
6705 2175 9904 3743 1902 5393 3032 8432 0612 7972
1872 8292 2366 8603 4288 6809 4357 1072 6822 5611
2559 7534 2281 7351 2064 0611 9613 2000 0327 6145
4399 3751 9783 5399 5175 8894 0296 9483 0400 2272
6074 8827 2195 2532 7680 4288 6807 3101 6850 6410
5155 7186 4722 6721 0838 3632 5355 9369 2006 7681
3193 2800 6184 7891 9838 6123 9397 4019 8389 9508
8610 1880 7423 3384 4625 6653 2900 6290 9286 2396
4778 8818 2992 6300 4239 9595 4384 0611 7687 2088
3987 1619 4164 2542 4042 7799 9084 0278 8422 4330
2977 0248 2793 3351 4922 8878 5703 7421 2054 4391
1312 2919 8220 7285 5902 7882 1403 5354 9913 7109
3890 7193 7799 9190 3275 7840 1872 6232 5295 3148
0793 3468 8762 2492 5854 8430 8472 2264 9279 2128
2139 4552 3444 6462 2524 8601 3372 1848 1472 9667
8277 9153 2880 9053 6880 4284 5044 8931 0861 1517
2236 4778 6639 0862 9509 2141 0208 1450 1222 5281
8837 7686 1771 3374 2894 7314 6856 0440 3766 6047
6605 6380 4599 3333 0713 8401 7146 8940 2629 2006
8399 8175 3525 1646 4019 8390 4344 8975 4489 3423
8053 3046 9102 4515 2944 9763 3003 3408 1199 2791
9837 9378 3237 7016 7593 5958 0068 3114 0456 6840
2557 6395 9496 1884 0612 8102 4402 5498 0422 3335

5/ Donald B. Owen, HANDBOOK OF STATISTICAL TABLES, © 1962. U.S.
Department of Energy, pp: 28-30, 519-523. Published by Addison-Wesley
Publishing Co., Inc., Reading, MA. Reprinted with permission of the
publisher.


Table A.1. Ten thousand random digits (cont.).
Note: There are 40 columns and 50 rows on each page.

2671 4690 1550 2262 2597 8034 0785 2978 4409 0?37
9111 0250 3275 7519 9740 4577 2064 0286 3398 linb
0391 6035 9230 4999 3332 0608 6113 0391 5789 9926
2475 2144 1886 2079 3004 9686 5669 4367 9306 2595
5336 5845 2095 6446 5694 3641 1085 8705 5416 9066
6808 0423 0155 1652 7897 4335 3567 7109 9690 3739
8525 0577 8940 9451 6726 0876 3818 7607 8854 3566
0398 0741 8787 3043 5063 0617 1770 5048 7721 7632
3623 9636 3638 1406 5731 3978 8068 7238 9715 3363
0739 2644 4917 8866 3632 5399 5175 7422 2476 2607
6713 3041 8133 8749 8835 6745 3597 3476 3816 3455
7775 9315 0432 8327 0861 1515 2297 3375 3713 9174
8599 2122 6842 9202 0810 2936 1514 2090 3067 3574
7955 3759 5254 1126 5553 4713 9605 7909 1658 5490
4766 0070 7260 6033 7997 0109 5993 7592 5436 1727
5165 1670 2534 8811 8231 3721 7947 5719 2640 1394
9111 0513 2751 8256 2931 7783 1281 6531 7259 6993
1667 1084 7889 8963 7018 8617 6381 0723 4926 4551
2145 4587 8585 2412 5431 4667 1942 7238 9613 2212
2739 5528 1481 7528 9368 1823 6979 2547 7268 2467
8769 5480 9160 5354 9700 1362 2774 7980 9157 8788
6531 9435 3422 2474 1475 0159 3414 5224 8399 5820
2937 4134 7120 2206 5084 9473 3958 7320 9878 8609
1581 3285 3727 8924 6204 0797 0882 5945 9375 9153
6268 1045 7076 1436 4165 0143 0293 4190 7171 7932
4293 0523 8625 1961 1039 2856 4889 4358 1492 3804
6936 4213 3212 7229 1230 0019 5998 9206 6753 3762
5334 7641 3258 3769 1362 2771 6124 9813 7915 3960
9373 1158 4418 8826 5665 5896 0358 4717 8232 4859
6968 9428 8950 5346 1741 2348 8143 5377 7695 0685
4229 0587 8794 4009 9691 4579 3302 7673 9629 5246
3807 7785 7097 5701 6639 0723 4819 0900 2713 7650
4891 8829 1642 2155 0796 0466 2946 2970 9143 6590
1055 2968 7911 7479 8199 9735 8271 5339 7058 2964
2983 2345 0568 4125 0894 8302 0506 6761 7706 4310
4026 3129 2968 8053 2797 4022 9838 9611 0975 2437
4075 0260 4256 0337 2355 9371 2954 6021 5783 2827
8488 5450 1327 7358 2034 8060 1788 6913 6123 9405
1976 1749 5742 4098 5887 4567 6064 2777 7830 5668
2793 4701 9466 9554 8294 2160 7486 1557 4769 2781
0916 6272 6325 7188 9611 1181 2301 5516 5451 6832
5961 1149 7946 1950 2010 0600 5655 0796 0569 4365
3222 4189 1891 8172 8731 4769 2782 1325 4238 9279
1176 7834 4600 9992 9449 5824 5344 1008 6678 1921
2369 8971 2314 4806 5071 8908 8274 4936 3357 4441
0041 4329 9265 0352 4764 9070 7527 7791 1094 2008
0803 8302 6814 2422 6351 0637 0514 0246 1845 8594
9965 7804 3930 8803 0268 1426 3130 3613 3947 8086
0011 2387 3148 7559 4216 2946 2865 6333 1916 2259
1767 9871 3914 5790 5287 7915 8959 1346 5482 9251


Table A.1. Ten thousand random digits (cont.).
Note: There are 40 columns and 50 rows on each page.

2604 3074 0504 3828 7881 0797 1094 4098 4940 7067
6930 4180 3074 0060 0909 3187 8991 0682 2385 2307
6160 9899 9084 5704 5666 3051 0325 4733 5905 9226
4884 1857 2847 2581 4870 1782 2980 0587 8797 5545
7294 2009 9020 0006 4309 3941 5645 6238 5052 4150
3478 4973 1056 3687 3145 5988 4214 5543 9185 9375
1764 7860 4150 2881 9895 2531 7363 8756 3724 9359
3025 0890 6436 3461 1411 0303 7422 2684 6256 3495
1771 3056 6630 4982 2386 2517 4747 5505 8785 8708
0254 1892 9066 4890 8716 2258 2452 3913 6790 6331
8537 9966 8224 9151 1855 8911 4422 1913 2000 1482
1475 0261 4465 4803 8231 6469 9935 4256 0648 7768
5209 5569 8410 3041 4325 7290 3381 5209 5571 9458
5456 5944 6038 3210 7165 0723 4820 1846 0005 3865
5043 6694 4853 8425 5871 1322 1052 1452 2486 1669
1719 0148 6977 1244 6443 5955 7945 1218 9391 6485
7432 2955 3933 8110 8585 1893 9218 7153 7566 6040
4926 4761 7812 7439 6436 3145 5934 7852 9095 9497
0769 0683 3768 1048 8519 2987 0124 3064 1881 3177
0805 3139 8514 5014 3274 6395 0549 3858 0820 6406
0204 7273 4964 5475 2648 6977 1371 6971 4850 6873
0092 1733 2349 2648 6609 5676 6445 3271 8867 3469
3139 4867 3666 9783 5088 4852 4143 7923 3858 0504
2033 7430 4389 7121 9982 0651 9110 9731 6421 4731
3921 0530 3605 8455 4205 7363 3081 3931 9331 1313
4111 9244 8135 9877 9529 9160 4407 9077 5306 0054
6573 1570 6654 3616 2049 7001 5185 7108 9270 6550
8515 8029 6880 4329 9367 1087 9549 1684 4838 5686
3590 2106 3245 1989 3529 3828 8091 6054 5656 3035
7212 9909 5005 7660 2620 6406 0690 4240 4070 6549
6701 0154 8806 1716 7029 6776 9465 8818 2886 3547
3777 9532 1333 8131 2929 6987 2408 0487 9172 6177
2495 3054 1692 0089 4090 2983 2136 8947 4625 7177
2073 8878 9742 3012 0042 3996 9930 1651 4982 9645
2252 8004 7840 2105 3033 8749 9153 2872 5100 8674
2104 2224 4052 2273 4753 4505 7156 5417 9725 7599
2371 0005 3844 6654 3246 4853 4301 8886 5217 1153
3270 1214 9649 1872 6930 9791 0248 2687 8126 1501
6209 7237 1966 5541 4224 7080 7630 6422 1160 5675
1309 9126 2920 4359 1726 0562 9654 4182 4097 7493
2406 8013 3634 6428 8091 5925 3923 1686 6097 9670
7365 9859 9378 7084 9402 9201 1815 7064 4324 7081
2889 4738 9929 1476 0785 3832 1281 5821 3690 9185
7951 3781 4755 6986 1659 5727 8108 9816 5759 4188
4548 6778 7672 9101 3911 8127 1918 8512 4197 6402
5701 8342 2852 4278 3343 9830 1756 0546 6717 3114
2187 7266 1210 3797 1636 7917 9933 3518 6923 6349
9360 6640 1315 6284 8265 7232 0291 3467 1088 7834
7850 7626 0745 1992 4993 7349 6451 6186 8916 4292
6186 9233 6571 0925 1748 5490 5264 3820 9829 1335

Table A.1. Ten thousand random digits (cont.).
Note: There are 40 columns and 50 rows on each page.

6063 2353 8531 8892 4109 5782 2283 1385 0699 5927
6305 1326 4551 2815 8937 2908 0698 5509 4303 9911
01*3 0187 8127 2026 8313 8341 2479 4722 6602 2236
1031 0754 7989 4948 1804 3025 0997 9562 3674 7876
2022 3227 2147 5613 2857 8859 4941 7274 9412 0620
91*9 0806 9751 8870 9677 9676 1854 8094 7658 7012
5863 0513 1402 3866 8696 9142 6063 2252 7818 2477
8724 0806 9644 8284 7010 0868 9076 4915 5751 9214
6783 4207 2958 5295 3175 3396 8117 5918 1037 4319
0862 1620 4690 0036 9654 4078 1918 8721 8454 7671
939* 2466 6427 5395 9393 0520 7074 0634 5578 4023
3220 3058 7787 7706 4094 5603 3303 8300 6185 8705
1491 3503 0584 7221 6176 0116 0309 1975 0910 3535
4368 5705 8579 5790 7244 6547 8495 7973 1805 7251
2325 4026 2919 8327 0267 2616 6572 8620 8245 6257
0591 1775 5134 8709 7373 3332 0507 5525 7640 2840
3471 1461 1149 6798 6070 9930 1862 3672 6718 3849
2600 9885 6219 3668 1005 5418 5832 0416 4220 4692
9572 7874 6034 4514 2628 1693 0628 2200 9006 3795
0822 2790 9386 5783 2689 2565 1565 0349 3410 5216
4329 3028 2549 2529 9434 3083 6800 8569 9290 8298
9289 5212 2355 9367 1297 1638 9282 3720 7178 2695
3932 9960 3399 1700 8253 1375 4594 6024 1223 5383
2282 0648 7561 7528 5870 7907 0713 8608 9682 8576
9933 3416 5957 2574 5553 5534 4707 3206 0963 2459
9015 6416 6603 2967 7591 5013 2878 8424 5452 4659
1539 0719 2637 9969 8450 4489 3528 3364 1459 9708
6849 5595 7969 2582 5627 1920 9772 8560 0892 6500
2523 7769 3536 9611 1079 1694 1254 4195 5799 5928
0701 7355 0587 8878 3446 1137 7690 0647 1407 6362
2163 8543 4594 6022 0496 8648 2999 1262 6702 0811
0327 5727 1070 5996 8660 9024 2135 9799 8414 9136
2169 3160 8707 6361 6339 4054 3251 7397 3480 5805
8393 8147 5360 4150 2990 3380 1789 7436 4781 0337
9726 9151 2064 0609 SS^ 9095 9737 2897 6510 8891
0515 2296 2636 9756 5313 7754 0916 6066 3905 1298
0649 8398 5614 0140 3155 2211 4988 3674 7663 0620
0026 9426 8005 8579 5774 7962 5092 5856 1626 0980
3422 0092 1626 1298 2475 1997 9796 7076 1541 1731
8191 1983 9164 1885 5468 8216 4327 8109 5880 9804
7408 0486 7654 4829 2711 6592 4785 5901 7147 9314
8261 9440 8118 6338 8157 9052 9093 8449 4066 4894
9274 8838 8342 3114 0455 6212 8862 6701 0099 0501
2699 0383 1400 3484 1492 4683 5369 3851 5870 0903
8740 0349 3502 3971 9960 6325 6727 4715 2945 9938
0247 2372 0424 0578 0036 1619 4479 7108 8520 1487
5136 9444 8343 1152 3615 1420 8923 7307 3978 5724
4844 8931 0964 2878 8212 9328 2656 1965 4805 0634
0205 8457 4333 2555 5353 9201 1606 2715 4014 1877
2517 5061 7642 3891 7713 7066 5435 1200 7455 5562

Table A.1. Ten thousand random digits (cont.).
Note: There are 40 columns and 50 rows on each page.

8529 7631 7050 2275 4383 0162 1937 0302 7109 9024
4272 3581 6632 5942 9513 6119 7721 7033 4632 6904
6009 1247 3898 2058 6466 8697 9562 3254 9644 8076
8714 0867 1189 9909 5113 7899 7558 2765 5076 2377
5862 5635 9795 6127 2872 5351 3380 2010 2836 4794
2834 0662 4423 2226 8886 4902 6038 3214 9241 1735
4537 0620 3480 6208 6784 4835 4567 5961 1202 8426
4090 3275 7424 3820 0108 5045 9145 7824 8365 5893
0827 6802 9469 4171 7282 4667 1838 5054 5145 5126
9104 5561 0764 6940 6960 2201 9636 0020 1845 1992
6395 0676 0014 3881 1247 3897 1951 2503 2864 6120
5860 4900 5410 2079 3125 7185 3593 3680 1846 9771
4869 7923 3859 1031 4559 2451 3592 3470 9728 3145
9482 9765 1065 6314 9731 7905 8245 6366 6283 7678
5106 6509 7110 1452 2693 2047 9311 9242 3977 5411
4932 9177 1521 8351 9323 6138 0147 6597 2757 2492
6979 7002 5668 8975 4593 5470 3210 7432 9644 8830
6392 0329 6775 9257 5190 0858 9602 5415 7922 3333
8115 4891 9133 8740 0349 3183 3662 0602 9763 6661
3962 2652 9426 8003 7631 7262 8127 1919 9353 9323
6760 6323 5492 6287 9442 4659 1136 7479 7883 1716
6216 2155 0832 7787 8022 2445 7512 2459 5158 6907
9684 9605 6838 9176 0684 3202 5595 8180 5352 9092
4656 5208 5255 1654 0591 2192 0875 3609 7364 7593
5664 6896 1992 4789 8101 8005 8682 6839 9601 5515
3987 2522 7665 1561 8164 2106 3559 9274 8626 2168
4786 1354 7318 0697 1381 6380 9580 7420 1848 1367
2430 6737 8440 8489 5691 9136 4691 0560 8603 4510
8616 5854 9854 0481 5877 8567 8239 8512 4199 7558
6225 3511 5336 8899 6965 8279 0529 9118 8078 4301
6654 3405 9538 4950 3170 5234 4458 2951 2062 4167
5303 4532 7187 8899 6975 0474 2187 7372 4547 9630
7486 2727 9740 2195 2112 8781 2406 8221 7598 7981
4999 3648 8313 8612 0078 1904 6313 9176 0997 9562
0680 1020 8141 3907 2033 7323 7096 7287 3041 4010
7401 8220 4851 3593 3678 1271 2563 4197 6717 3325
0066 2090 2961 0765 7437 5197 7929 9274 1368 9618
1882 4099 9294 4850 7082 8577 4762 8229 2863 5313
8231 3741 4893 9976 6201 2429 6105 5312 7335 4463
9402 9619 1779 3147 7140 1983 9160 0027 1270 1678
1637 8438 7649 1448 0273 5717 1825 7941 8345 2317
2103 2096 9248 9294 4958 3694 0085 6993 9408 3834
4465 4803 8562 8795 4919 9770 6864 8402 8091 5789
3852 2250 6978 6770 9076 4797 8036 1515 2497 1095
0793 5024 0470 8933 1793 2906 8995 0565 7505 7778
7311 5316 9218 3850 5241 4709 4709 4254 7349 6769
1779 2728 7148 9941 8777 0434 9693 5512 2251 7733
6105 5105 3599 7389 0183 2034 7642 3995 9766 4578
8698 2587 2087 7638 1872 0418 6340 9451 6937 4530
3937 3988 2038 1144 2640 1631 4008 1921 4546 9104


Table A.2. Critical values for Student's t-distribution 5/

Pr{Student's t < tabled value} = γ

                                    γ
  f      0.75      0.90      0.95     0.975      0.99     0.995

  1    1.0000    3.0777    6.3138   12.7062   31.8207   63.6574
  2    0.8165    1.8856    2.9200    4.3027    6.9646    9.9248
  3    0.7649    1.6377    2.3534    3.1824    4.5407    5.8409
  4    0.7407    1.5332    2.1318    2.7764    3.7469    4.6041
  5    0.7267    1.4759    2.0150    2.5706    3.3649    4.0322
  6    0.7176    1.4398    1.9432    2.4469    3.1427    3.7074
  7    0.7111    1.4149    1.8946    2.3646    2.9980    3.4995
  8    0.7064    1.3968    1.8595    2.3060    2.8965    3.3554
  9    0.7027    1.3830    1.8331    2.2622    2.8214    3.2498
 10    0.6998    1.3722    1.8125    2.2281    2.7638    3.1693
 11    0.6974    1.3634    1.7959    2.2010    2.7181    3.1058
 12    0.6955    1.3562    1.7823    2.1788    2.6810    3.0545
 13    0.6938    1.3502    1.7709    2.1604    2.6503    3.0123
 14    0.6924    1.3450    1.7613    2.1448    2.6245    2.9768
 15    0.6912    1.3406    1.7531    2.1315    2.6025    2.9467
 16    0.6901    1.3368    1.7459    2.1199    2.5835    2.9208
 17    0.6892    1.3334    1.7396    2.1098    2.5669    2.8982
 18    0.6884    1.3304    1.7341    2.1009    2.5524    2.8784
 19    0.6876    1.3277    1.7291    2.0930    2.5395    2.8609
 20    0.6870    1.3253    1.7247    2.0860    2.5280    2.8453
 21    0.6864    1.3232    1.7207    2.0796    2.5177    2.8314
 22    0.6858    1.3212    1.7171    2.0739    2.5083    2.8188
 23    0.6853    1.3195    1.7139    2.0687    2.4999    2.8073
 24    0.6848    1.3178    1.7109    2.0639    2.4922    2.7969
 25    0.6844    1.3163    1.7081    2.0595    2.4851    2.7874
 26    0.6840    1.3150    1.7056    2.0555    2.4786    2.7787
 27    0.6837    1.3137    1.7033    2.0518    2.4727    2.7707
 28    0.6834    1.3125    1.7011    2.0484    2.4671    2.7633
 29    0.6830    1.3114    1.6991    2.0452    2.4620    2.7564
 30    0.6828    1.3104    1.6973    2.0423    2.4573    2.7500
 31    0.6825    1.3095    1.6955    2.0395    2.4528    2.7440
 32    0.6822    1.3086    1.6939    2.0369    2.4487    2.7385
 33    0.6820    1.3077    1.6924    2.0345    2.4448    2.7333
 34    0.6818    1.3070    1.6909    2.0322    2.4411    2.7284
 35    0.6816    1.3062    1.6896    2.0301    2.4377    2.7238
 36    0.6814    1.3055    1.6883    2.0281    2.4345    2.7195
 37    0.6812    1.3049    1.6871    2.0262    2.4314    2.7154
 38    0.6810    1.3042    1.6860    2.0244    2.4286    2.7116
 39    0.6808    1.3036    1.6849    2.0227    2.4258    2.7079
 40    0.6807    1.3031    1.6839    2.0211    2.4233    2.7045
 41    0.6805    1.3025    1.6829    2.0195    2.4208    2.7012
 42    0.6804    1.3020    1.6820    2.0181    2.4185    2.6981
 43    0.6802    1.3016    1.6811    2.0167    2.4163    2.6951
 44    0.6801    1.3011    1.6802    2.0154    2.4141    2.6923
 45    0.6800    1.3006    1.6794    2.0141    2.4121    2.6896

5/ Donald B. Owen, HANDBOOK OF STATISTICAL TABLES, © 1962. U.S.
Department of Energy, pp: 28-30, 519-523. Published by Addison-Wesley
Publishing Co., Inc., Reading, MA. Reprinted with permission of the
publisher.


Table A.2. Critical values for Student's t-distribution (cont.).

                                    γ
  f      0.75      0.90      0.95     0.975      0.99     0.995

 46    0.6799    1.3002    1.6787    2.0129    2.4102    2.6870
 47    0.6797    1.2998    1.6779    2.0117    2.4083    2.6846
 48    0.6796    1.2994    1.6772    2.0106    2.4066    2.6822
 49    0.6795    1.2991    1.6766    2.0096    2.4049    2.6800
 50    0.6794    1.2987    1.6759    2.0086    2.4033    2.6778
 51    0.6793    1.2984    1.6753    2.0076    2.4017    2.6757
 52    0.6792    1.2980    1.6747    2.0066    2.4002    2.6737
 53    0.6791    1.2977    1.6741    2.0057    2.3988    2.6718
 54    0.6791    1.2974    1.6736    2.0049    2.3974    2.6700
 55    0.6790    1.2971    1.6730    2.0040    2.3961    2.6682
 56    0.6789    1.2969    1.6725    2.0032    2.3948    2.6665
 57    0.6788    1.2966    1.6720    2.0025    2.3936    2.6649
 58    0.6787    1.2963    1.6716    2.0017    2.3924    2.6633
 59    0.6787    1.2961    1.6711    2.0010    2.3912    2.6618
 60    0.6786    1.2958    1.6706    2.0003    2.3901    2.6603
 61    0.6785    1.2956    1.6702    1.9996    2.3890    2.6589
 62    0.6785    1.2954    1.6698    1.9990    2.3880    2.6575
 63    0.6784    1.2951    1.6694    1.9983    2.3870    2.6561
 64    0.6783    1.2949    1.6690    1.9977    2.3860    2.6549
 65    0.6783    1.2947    1.6686    1.9971    2.3851    2.6536
 66    0.6782    1.2945    1.6683    1.9966    2.3842    2.6524
 67    0.6782    1.2943    1.6679    1.9960    2.3833    2.6512
 68    0.6781    1.2941    1.6676    1.9955    2.3824    2.6501
 69    0.6781    1.2939    1.6672    1.9949    2.3816    2.6490
 70    0.6780    1.2938    1.6669    1.9944    2.3808    2.6479
 71    0.6780    1.2936    1.6666    1.9939    2.3800    2.6469
 72    0.6779    1.2934    1.6663    1.9935    2.3793    2.6459
 73    0.6779    1.2933    1.6660    1.9930    2.3785    2.6449
 74    0.6778    1.2931    1.6657    1.9925    2.3778    2.6439
 75    0.6778    1.2929    1.6654    1.9921    2.3771    2.6430
 76    0.6777    1.2928    1.6652    1.9917    2.3764    2.6421
 77    0.6777    1.2926    1.6649    1.9913    2.3758    2.6412
 78    0.6776    1.2925    1.6646    1.9908    2.3751    2.6403
 79    0.6776    1.2924    1.6644    1.9905    2.3745    2.6395
 80    0.6776    1.2922    1.6641    1.9901    2.3739    2.6387
 81    0.6775    1.2921    1.6639    1.9897    2.3733    2.6379
 82    0.6775    1.2920    1.6636    1.9893    2.3727    2.6371
 83    0.6775    1.2918    1.6634    1.9890    2.3721    2.6364
 84    0.6774    1.2917    1.6632    1.9886    2.3716    2.6356
 85    0.6774    1.2916    1.6630    1.9883    2.3710    2.6349
 86    0.6774    1.2915    1.6628    1.9879    2.3705    2.6342
 87    0.6773    1.2914    1.6626    1.9876    2.3700    2.6335
 88    0.6773    1.2912    1.6624    1.9873    2.3695    2.6329
 89    0.6773    1.2911    1.6622    1.9870    2.3690    2.6322
 90    0.6772    1.2910    1.6620    1.9867    2.3685    2.6316
Table A.2. Critical values for Student's t-distribution (cont.)

   f   0.75     0.90     0.95     0.975    0.99     0.995

  91   0.6772   1.2909   1.6618   1.9864   2.3680   2.6309
  92   0.6772   1.2908   1.6616   1.9861   2.3676   2.6303
  93   0.6771   1.2907   1.6614   1.9858   2.3671   2.6297
  94   0.6771   1.2906   1.6612   1.9855   2.3667   2.6291
  95   0.6771   1.2905   1.6611   1.9853   2.3662   2.6286
  96   0.6771   1.2904   1.6609   1.9850   2.3658   2.6280
  97   0.6770   1.2903   1.6607   1.9847   2.3654   2.6275
  98   0.6770   1.2902   1.6606   1.9845   2.3650   2.6269
  99   0.6770   1.2902   1.6604   1.9842   2.3646   2.6264
 100   0.6770   1.2901   1.6602   1.9840   2.3642   2.6259
 102   0.6769   1.2899   1.6599   1.9835   2.3635   2.6249
 104   0.6769   1.2897   1.6596   1.9830   2.3627   2.6239
 106   0.6768   1.2896   1.6594   1.9826   2.3620   2.6230
 108   0.6768   1.2894   1.6591   1.9822   2.3614   2.6221
 110   0.6767   1.2893   1.6588   1.9818   2.3607   2.6213
 112   0.6767   1.2892   1.6586   1.9814   2.3601   2.6204
 114   0.6766   1.2890   1.6583   1.9810   2.3595   2.6196
 116   0.6766   1.2889   1.6581   1.9806   2.3589   2.6189
 118   0.6766   1.2888   1.6579   1.9803   2.3584   2.6181
 120   0.6765   1.2886   1.6577   1.9799   2.3578   2.6174
 122   0.6765   1.2885   1.6574   1.9796   2.3573   2.6167
 124   0.6765   1.2884   1.6572   1.9793   2.3568   2.6161
 126   0.6764   1.2883   1.6570   1.9790   2.3563   2.6154
 128   0.6764   1.2882   1.6568   1.9787   2.3558   2.6148
 130   0.6764   1.2881   1.6567   1.9784   2.3554   2.6142
 132   0.6764   1.2880   1.6565   1.9781   2.3549   2.6136
 134   0.6763   1.2879   1.6563   1.9778   2.3545   2.6130
 136   0.6763   1.2878   1.6561   1.9776   2.3541   2.6125
 138   0.6763   1.2877   1.6560   1.9773   2.3537   2.6119
 140   0.6762   1.2876   1.6558   1.9771   2.3533   2.6114
 142   0.6762   1.2875   1.6557   1.9768   2.3529   2.6109
 144   0.6762   1.2875   1.6555   1.9766   2.3525   2.6104
 146   0.6762   1.2874   1.6554   1.9763   2.3522   2.6099
 148   0.6762   1.2873   1.6552   1.9761   2.3518   2.6095
 150   0.6761   1.2872   1.6551   1.9759   2.3515   2.6090
 200   0.6757   1.2858   1.6525   1.9719   2.3451   2.6006
 300   0.6753   1.2844   1.6499   1.9679   2.3388   2.5923
 400   0.6751   1.2837   1.6487   1.9659   2.3357   2.5882
 500   0.6750   1.2832   1.6479   1.9647   2.3338   2.5857
 600   0.6749   1.2830   1.6474   1.9639   2.3326   2.5840
 700   0.6748   1.2828   1.6470   1.9634   2.3317   2.5829
 800   0.6748   1.2826   1.6468   1.9629   2.3310   2.5820
 900   0.6748   1.2825   1.6465   1.9626   2.3305   2.5813
1000   0.6747   1.2824   1.6464   1.9623   2.3301   2.5808
   ∞   0.6745   1.2816   1.6449   1.9600   2.3263   2.5758

43 




APPENDIX  B:  The  3-F  Erosion  Bridge — A  New  Tool  for  Measuring  Soil  Erosion 

The following document is reprinted with the permission of the State
of California, Department of Forestry, and comprises pages 44-50 of this
Report.


State  of  California 
The  Resources  Agency 
Department  of  Forestry 


Range  Improvement  Studies 


1416  Ninth  Street 

Sacramento,  CA  95814 

Phone  916-445-5571 


No.  23 


September  1978 


THE  3-F  EROSION  BRIDGE— A  NEW  TOOL  FOR  MEASURING  SOIL  EROSION 
Gerald  E.  Ranger  and  Franklin  F.  Frank* 


The California Department of Forestry, through its range management and
watershed protection programs, is frequently required to evaluate soil
erosion on watersheds following fire or other vegetative disturbances.
Past methods used to obtain these data were inexact and produced results
of questionable validity.

The 3-F erosion bridge was developed to provide more accurate data. The
tool is: (1) inexpensive; (2) easily constructed; (3) simple and
reliable; (4) easily and conveniently carried in the field; (5) quickly read;
and (6) effective in yielding valid data.


Fig. 1. Measuring soil erosion with the 3-F erosion bridge.


*  Forester  II,  Range  and  Watershed  Forester,  CDF,  Central  Coast  Region; 
State  Forest  Ranger  III  (formerly  Forester  II,  Range  and  Watershed 
Forester,  Central  Coast  Region). 




THE  TOOL 

The bridge is a slightly modified 48-inch aluminum masonry level
positioned between two fixed support pins. This provides several
permanent reference points from which erosion can be measured without
disturbing the soil at the point of measurement. Vertical and horizontal
level bubbles assure installation accuracy and are used to determine
whether the pins have been disturbed between measurements.

The  bridge  and  accessories  necessary  for  its  operation  are  described  as 
follows: 

•  3-F erosion bridge: 48-inch aluminum masonry level machined to
provide 10 vertical measuring holes, a slot on one end and a hole on
the other for support. (See figures 3-6.)

•  Two steel support pins, each 5/8 inch in diameter, 4 feet in length,
sharpened to a point at one end and chamfered 1/8 inch or more at
the other.

•  Sledge  hammer. 

•  Metal  soil-surface-level  rod,  5/16  inch  in  diameter,  2  feet  long. 

•  Clipboard  and  appropriate  forms. 

•  Calibrated  measuring  tape. 


Fig.  2.  3-F  erosion  bridge  and  accessories. 

Fig. 3. 3-F erosion bridge in place; note level bubble.


ESTABLISHMENT  AND  OPERATION 

After  selecting  an  appropriate  site,  one  support  pin  is  plumbed  using 
the  vertical  bubble  in  the  level,  and  then  driven  approximately  six  to 
eight  inches  into  the  soil.  It  is  then  replumbed  and  driven  to  a  depth 
of  approximately  three  feet,  replumbing  as  necessary  to  keep  the  pin  plumb. 

After the pin is in place, the bridge is used to measure the point at
which the second pin is to be driven (approximately 43 inches from the
first pin). The second pin is then driven into the soil in the same
manner as the first. When the second pin is approximately the same
height as the first, the level is positioned on top of the pins to check
whether they are level. Adjustments in height are made by removing the level
and driving the taller pin farther.

The  tool  is  ready  for  operation  when  the  pins  are  level  and  the  level  is 
positioned  on  the  pins  to  form  the  bridge.  Measurements  are  made  at  ten 
points  along  the  bridge  and  readings  are  recorded  on  an  erosion 
measurement  form. 

At  each  of  the  ten  points,  there  are  two  vertically  aligned  holes  through 
which  the  surface  level  rod  slides.  The  rod  is  lowered  gently  to  the 
soil  surface. 


It  is  important  that  no  downward  pressure  be  placed  on  the  rod  and  that 
it  is  not  dropped  into  position.  The  weight  of  the  rod  compresses  the 
loose  soil  and  ash  and  thereby  reduces  the  possibility  of  mistakenly 
measuring  compaction  or  settling  as  erosion. 


With  the  rod  resting  on  the  soil,  a  reading  is  made  by  placing  the 
measuring  tape  on  top  of  the  bridge  and  adjacent  to  the  rod;  the  distance 
to  the  top  of  the  rod  is  read  and  recorded.  As  subsequent  remeasurements 
are  made,  soil  movement  at  each  of  the  ten  points  can  be  calculated  by 
comparing  the  current  reading  with  preceding  ones.  If  the  current 
reading  is  less,  a  loss  of  soil  or  erosion  at  that  point  is  indicated;  a 
higher  reading  indicates  soil  deposition  at  that  point. 

To  find  the  average  soil  movement  at  a  3-F  erosion  bridge  station,  the 
calculated  differences  at  all  points  are  algebraically  added  and  the 
total  divided  by  ten. 
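
The averaging step described above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample readings (in millimeters) are hypothetical, and the sign convention follows the text, where a lower current reading indicates erosion and a higher one indicates deposition.

```python
def average_soil_movement(previous, current):
    """Return (mean change, per-point changes) for one bridge station.

    Each change is current minus previous, so a negative value means the
    reading dropped (erosion) and a positive value means deposition.
    """
    if len(previous) != len(current):
        raise ValueError("both visits must record the same number of points")
    changes = [c - p for p, c in zip(previous, current)]
    # Algebraic sum of the differences divided by the number of points
    # (ten for a standard 3-F erosion bridge station).
    return sum(changes) / len(changes), changes


# Hypothetical readings at the ten measuring holes, in millimeters.
first_visit = [152, 148, 150, 155, 149, 151, 153, 147, 150, 152]
second_visit = [150, 147, 151, 152, 149, 148, 150, 147, 149, 151]

mean_change, point_changes = average_soil_movement(first_visit, second_visit)
print(point_changes)
print(mean_change)
```

With these invented readings the station shows net erosion, although one point records deposition; keeping the per-point changes alongside the mean makes that visible.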


DISCUSSION 

While  use  of  the  bridge  is  relatively  easy,  some  problems  can  be  expected 
if  proper  equipment  and  techniques  are  not  used. 

Field testing of various diameters and lengths of support pins showed
that the 5/8 inch by 4 foot size was the minimum satisfactory size.
Thinner or shorter pins, although lighter to carry, proved unsatisfactory
because they were easily moved out of position due to soil expansion or
animal disturbances.

Mushrooming  of  the  ends  of  the  support  pins  can  be  a  problem  unless  they 
are  chamfered  more  than  1/8  inch  or  a  driving  collar  is  used. 


Fig. 4. Underside of 3-F erosion bridge showing hole for 5/8 inch
support pin and 5/16 inch alignment holes for metal rod.




Fig. 5. 3-F erosion bridge in place.


The alignment and spacing of the support pins can be difficult unless
care is used when driving them. The pins should be checked and plumbed
frequently as they are driven. Rocky soils can present problems because
the pins are deflected by the rocks and their proper spacing cannot be
maintained. Sometimes the pins cannot be driven in far enough because
they hit large rocks.

On soils with high coefficients of expansion (clays), care should be
taken to be sure pins have not moved between measurements. It is also
important to make measurements when soil moisture content is about equal
to that of the previous measurements.

It is important that the second pin be started exactly 43 inches from
the first; otherwise the pins may have to be bent to accommodate the
erosion bridge. Although a tolerance of 1/2 inch in either direction is
allowed by the slot in the bridge, it is difficult to achieve these
tolerances unless the pins are carefully driven.

Field  testing  has  shown  that  a  two-man  team  can  establish  an  erosion 
station  in  approximately  15  minutes,  with  the  actual  measurements  taking 
about  two  minutes.  The  accuracy  of  the  measurements  varies  with  the 
graduations  on  the  measuring  tape.  A  metric  tape  can  be  accurately  read 
to  1  millimeter.  A  tape  graduated  in  0.01  foot  can  be  accurately  read 
to  0.005  foot. 


SUMMARY  AND  CONCLUSIONS 

The  3-F  erosion  bridge  is  an  effective  tool  for  obtaining  meaningful 
soil  erosion  data  on  burned  and  denuded  watersheds.  It  is  a  simple, 
inexpensive,  reliable,  and  accurate  instrument. 

Field  work  has  shown  that  the  bridge  can  yield  valid  data  for  determining 
the  relative  magnitude  of  on-site,  sheet-type  soil  movement  and  is 
especially  useful  when  studying  the  effects  of  vegetative  cover  on  the 
amount  of  soil  movement  after  fire.  To  estimate  total  soil  movement, 
the  3-F  erosion  bridge  must  be  used  in  combination  with  other  techniques 
for  measuring  gully  and  streambank  erosion. 

Although  the  data  obtained  is  accurate  for  each  point  measured,  care 
should  be  taken  in  how  data  is  interpreted  and  the  conclusions  drawn 
from  it. 

ACKNOWLEDGMENTS 

The  authors  wish  to  express  their  appreciation  to  the  following  California 
Department  of  Forestry  employees:  Patrick  Fay,  Equipment  Maintenance 
Supervisor,  and  Brad  Mailahn,  Ecology  Corpsman,  for  their  guidance  and 
assistance  in  modifying  the  erosion  bridge;  and  to  Harry  Harp,  Information 
Officer,  for  his  photographic  contributions. 


Fig. 6. Underside of 3-F erosion bridge showing oversized slot for 5/8
inch support pin. The slot allows about 1/2 inch tolerance in each
direction for ease in placing level on the second support pin.




Fig.  7.  Clipboard  and  sample  recording  form. 




Fig.  8.  Measuring  soil  erosion  on  point  number  3. 


51


APPENDIX  C:  Erosion  Bridge  Construction 

Schematic diagrams (Figures C.1 and C.2) show features of the erosion
bridge and measuring rod described in the paper by Ranger and Frank
(1978).


52


Figure C.1. Erosion bridge.

[Schematic: top view, bottom view, and side view of the bridge, with the
measuring holes, support hole, and support slot labeled; cross section
A-A' through the support hole; cross section B-B' through a measuring hole.]


53 


Figure C.2. Measuring rod.

[Schematic: expanded view of the graduated rod; shorter marks at 1 mm
intervals, longer marks at 1 cm intervals.]


54 

APPENDIX  D:  Stratified  Sampling 

Suppose  you  have  divided  the  project  area  into  k  different  strata. 
For  each  individual  stratum,  apply  the  procedures  developed  in  the  text 
giving  the  following  terms: 


$A_j$ = total size of the jth stratum (acres);

$P_j$ = proportion of the area that is soil in the jth stratum;

$\rho_j$ = bulk density of the soil in the jth stratum (g/cm$^3$);

$d_j$ = value of d for the jth stratum (computed with $n_j$ psu's);

$S^2(d_j)$ = variance of $d_j$;

where j = 1, 2, . . ., k. Note that the following must hold:

$$A = \sum_{j=1}^{k} A_j = \text{total acres in the project area},$$

and

$$n = \sum_{j=1}^{k} n_j = \text{total number of psu's taken on the project area}.$$


55 


Our estimates of d, r, and M are given by

$$d = \frac{\sum_{j=1}^{k} n_j \, d_j}{n},$$

$$r = (113.31/\Delta t) \, \frac{\sum_{j=1}^{k} \rho_j \, n_j \, d_j}{n},$$

and

$$M = (113.31) \, \frac{\sum_{j=1}^{k} P_j \, A_j \, \rho_j \, n_j \, d_j}{n}.$$

The variances of these terms are given by

$$S^2(d) = \frac{\sum_{j=1}^{k} n_j^2 \, S^2(d_j)}{n^2},$$

$$S^2(r) = \frac{(113.31/\Delta t)^2 \sum_{j=1}^{k} (\rho_j \, n_j)^2 \, S^2(d_j)}{n^2},$$

and

$$S^2(M) = \frac{(113.31)^2 \sum_{j=1}^{k} (P_j \, A_j \, \rho_j \, n_j)^2 \, S^2(d_j)}{n^2}.$$
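
The stratified estimates and their variances above can be computed with a short Python sketch such as the following. It rests on stated assumptions rather than a definitive implementation: the class and function names are hypothetical, the divisor printed as "At" in the scanned formula for r is read here as a time interval (called `dt`), and the sample stratum values are invented purely to exercise the arithmetic.

```python
from dataclasses import dataclass

K = 113.31  # conversion constant as quoted in the formulas above


@dataclass
class Stratum:
    acres: float          # A_j, total size of the stratum
    soil_fraction: float  # P_j, proportion of the area that is soil
    bulk_density: float   # rho_j, g/cm^3
    n_psu: int            # n_j, number of psu's in the stratum
    d: float              # d_j, value of d for the stratum
    var_d: float          # S^2(d_j), variance of d_j


def stratified_estimates(strata, dt):
    """Combine per-stratum values into project-wide d, r, M and variances.

    `dt` is assumed to be the time interval that appears as the divisor
    in the formula for r; that reading of the scan is an assumption.
    """
    n = sum(s.n_psu for s in strata)  # total psu's on the project area
    d = sum(s.n_psu * s.d for s in strata) / n
    r = (K / dt) * sum(s.bulk_density * s.n_psu * s.d for s in strata) / n
    m = K * sum(s.soil_fraction * s.acres * s.bulk_density * s.n_psu * s.d
                for s in strata) / n
    var_d = sum(s.n_psu ** 2 * s.var_d for s in strata) / n ** 2
    var_r = (K / dt) ** 2 * sum((s.bulk_density * s.n_psu) ** 2 * s.var_d
                                for s in strata) / n ** 2
    var_m = K ** 2 * sum(
        (s.soil_fraction * s.acres * s.bulk_density * s.n_psu) ** 2 * s.var_d
        for s in strata) / n ** 2
    return {"d": d, "r": r, "M": m,
            "var_d": var_d, "var_r": var_r, "var_M": var_m}


# Two invented strata, used only to show the calculation.
strata = [
    Stratum(acres=100, soil_fraction=0.8, bulk_density=1.3,
            n_psu=4, d=0.1, var_d=0.01),
    Stratum(acres=300, soil_fraction=0.6, bulk_density=1.5,
            n_psu=6, d=0.2, var_d=0.02),
]
est = stratified_estimates(strata, dt=1.0)
for name, value in est.items():
    print(name, value)
```

Each stratum contributes in proportion to its share of the psu's ($n_j/n$), so the variance terms carry the square of the same weights.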


Runoff Plots near Havre, Montana


The collection tank is a 100-gallon oval stock watering
trough. Water level in the tank is recorded by a mechanical
float counter that measures cumulative increases in stage, so
decreases in water level due to evaporation do not affect the
readings. The counter is designed to be read yearly; the reading
is multiplied by the area of the tank to give an annual runoff
amount, after precipitation has been subtracted. Sediment yield
will also be measured at this time.
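
The yearly calculation described above might be sketched as follows. The function name, the units, and the sample numbers are hypothetical, and the sketch assumes that precipitation is subtracted as a depth from the cumulative stage reading before multiplying by the tank area; the text does not spell out that order.

```python
def annual_runoff_volume(stage_rise_ft, precip_ft, tank_area_ft2):
    """Estimate annual runoff volume (cubic feet) collected in the tank.

    stage_rise_ft  cumulative stage increase read from the float counter
    precip_ft      precipitation depth subtracted from that reading
                   (assumed to be applied as a depth, before multiplying)
    tank_area_ft2  water-surface area of the collection tank
    """
    net_rise = stage_rise_ft - precip_ft
    return net_rise * tank_area_ft2


# Invented values for illustration only.
volume = annual_runoff_volume(stage_rise_ft=1.25, precip_ft=0.75,
                              tank_area_ft2=8.0)
print(volume)
```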

It is hoped that the installation of these plots and like
plots at three other BLM Districts will demonstrate the
effectiveness of this system and suggest improvements to it. It is
anticipated that these plots may provide an efficient tool to
the land manager concerned with describing hydrologic conditions
of rangelands.


ATTACHMENT  A 
Runoff  Plots  near  Havre,  Montana 



Bureau of Land Management
Library
Bldg. 50, Denver Federal Center
Denver, CO 80225


